PROTEASE M, A NOVEL SERINE PROTEASE

Background of the Invention 
Under normal growth conditions, cell proliferation is tightly regulated in
response to diverse intra- and extracellular signals. This is achieved by a complex
network of protooncogenes and tumor-suppressor genes that are components of various
signal transduction pathways. Activation of a protooncogene(s) and/or a loss of a tumor
suppressor gene(s) can lead to the unregulated activity of the cell cycle machinery.
Tumor suppressor genes can be divided into two classes: class I, in which a loss of
function results from a mutation or deletion, and class II, in which a loss of function
results from a regulatory block to expression (Lee et al. 1991. Proc. Natl. Acad. Sci.
88:2825). Thus, both activation and loss of genes can lead to unregulated cell
proliferation and to the accumulation of genetic errors which ultimately will result in the
development of cancer (Pardee, Science 246:603-608, 1989).

Malignancy is defined as neoplastic growth that tends to metastasize
(Stetler-Stevenson et al. 1993 Annu. Rev. Cell Biol. 9:541). Metastasis is a multistage
process involving numerous aberrant functions of the tumor cell. These aberrant
functions include tumor angiogenesis, attachment, adhesion to the vascular basement
membrane, local proteolysis, degradation of extracellular matrix components, migration
through the vasculature, invasion of the basement membrane, and proliferation at
secondary sites (Poste, G. and Fidler, I.J. (1980) Nature 283:139-146; Liotta, L.A. et al.
(1991) Cell 64:327-336). Therefore, cumulative changes in the expression of multiple
genes probably occur before tumor cells acquire the phenotype that enables them to
metastasize. The identification of genes involved in the development of the metastatic
phenotype is essential for an understanding of the molecular mechanisms underlying
metastasis and for the design of novel therapies that arrest progression of a
primary tumor.

Increased proteolytic potential is one documented feature of the
metastatic phenotype. This increased potential is thought to result from the combined
aberrant regulation of proteolytic enzymes (e.g., metalloproteinases and serine, cysteine
and aspartyl proteinases) and their endogenous inhibitors (for a review, see e.g., Sloane,
B.F. and Honn, K.V. (1984) Cancer Metastasis Rev. 3:249-263). For example,
increased activity of serine proteases has been implicated in metastasis (Testa et al.,
Cancer Metastasis Rev. 9:353, 1990; Dano et al., Adv. Cancer Res. 44:139, 1985;
Ossowsky, Cancer Res. 52:6754, 1992; Sumiyoshi, Int. J. Cancer 50:345, 1992; Duffy et
al., Cancer Res. 50:6827, 1992; and Meissauer et al., Exp. Cell Res. 192:453, 1991). In
addition, other proteases have been shown to be involved in augmenting tumor cell
invasion, such as metalloproteases (DeClerck et al., Cancer Res. 52:701, 1992; Wolf et
al., Proc. Natl. Acad. Sci. USA 90:1843, 1993; and Sato et al., Oncogene 7:77, 1992)
and cathepsins (Rochefort et al., Cancer Metast. Rev. 9:321, 1990; and Kobayashi et al.,
Cancer Res. 52:3610, 1992).

Serine proteases are protein-cleaving enzymes which contain a serine
residue in their active sites and which play important roles in diverse physiological
processes, including digestion (e.g., trypsin, chymotrypsin) and blood clotting (e.g.,
plasminogen activator, thrombin). Serine proteases also act as regulators of a variety of
processes by proteolytic activation of precursor proteins.

The kallikreins are a sub-family of serine proteases originally defined as
cleaving vasoactive peptides (kinins) from kininogen (Schachter M. (1980) Pharmacol.
Rev. 31:1-17). Currently the kallikreins comprise a large, multi-gene family in
rodents, although only three members of this family are known in humans. These genes,
clustered on chromosome 19q13.2-q13.4 (Riegman PH, et al. (1992) Genomics 14:6-11),
are hKLK1, hKLK2, and hKLK3, which encode the proteins hK1 (pancreatic/renal
kallikrein), hK2 (glandular kallikrein), and hK3 (prostate specific antigen), respectively
(Berg T, et al. (1992) Agents Actions 38 (Suppl. 1):19-25).

The hK1 protein is secreted from pancreas, kidney, and salivary glands
(Fukushima D, et al. (1985) Biochemistry 24:8037-8043), and is the only member of the
family having true kallikrein activity. Its major function is the generation of kinins from
kininogens and the regulation of blood pressure (Schachter, supra).

The hK2 protein has yet to be detected in human tissue or fluids, but its
sequence has been inferred from a genomic clone (Schedlich LJ, et al. (1987) DNA
6:429-437) as well as cDNA clones isolated from prostate libraries (Schedlich LJ, et al.
(1987) DNA 6:429-437). hK2 expression is specific for prostate and is regulated by
androgens (Schedlich et al., supra). Determining the function of this protein and
evaluating its usefulness as a marker for prostate cancer will have to await the
identification and isolation of the protein.

The hK3 protein is PSA, the prostate specific antigen. It is produced
predominantly in males by prostate epithelial cells and secreted into the seminal fluid,
where it serves to degrade the gel-like semenogelin protein and increase sperm motility
(Lilja H. (1985) J. Clin. Invest. 76:1899-1903; Lilja H, et al. (1987) J. Clin. Invest.
80:281-285). Although PSA is produced at higher levels in normal than in malignant
prostate tissue, a defect in the malignant tissues ultimately results in the leakage of PSA
into the bloodstream (McCormack RT, et al. (1995) Urology 45:729), forming the basis

Serine proteases may accomplish matrix degradation during metastases
by activating metalloproteases (Alexander and Werb. 1991. Extracellular Matrix
Degradation. In Cell Biology of Extracellular Matrix. Ed. by Hay, E.D. New York.
Plenum Press. 1991:255). The principal serine proteases implicated in matrix
degradation mediate the plasminogen activation cascade. Included in this group are the
urokinase plasminogen activator-receptor (uPA-uPAR), leukocyte elastase, and tumor
associated trypsin (Chen. 1992. Curr. Opin. Cell Biol. 4:802). Both uPA and tPA can
activate the serum protein plasminogen, yielding the broad-specificity protease plasmin by
cleavage of one bond. Plasmin participates in fibrinolysis, tissue remodeling and tumor
invasion (Chen, supra).

While proteases have been thought to promote tissue invasion and
metastases, the development of metastatic potential appears to be more complicated.
For example, overexpression of the protease inhibitors PAI-1 and PAI-2, which
negatively regulate plasminogen activator, has also been found to be associated with
certain types of cancers (Sumiyoshi et al. 1991. Thromb. Res. 63:59; Reilly et al. 1990.
Biochem. Soc. Transact. 18:354). Janicke et al. have hypothesized that increased PAI-1
secretion by tumor cells may enhance cell migration by upsetting the
protease-antiprotease equilibrium near the cell surface of a tumor cell, perhaps via a
mechanism involving urokinase plasminogen activator receptor clearance (Janicke et al.
1994. Cancer Res. 54:2527).

The identification of markers associated with the suppression of cancer, 
the development of cancer, and with the development of metastasis would be of great 
benefit. 



Summary of the Invention

Disclosed herein is a novel member of the serine protease family, referred
to as Protease M. A partial Protease M cDNA was originally identified, using the
differential display method, by its differential expression in a primary ductal breast
carcinoma and its reduced expression in a pleural metastasis from the same patient.
Subsequently, a full-length cDNA of 1,526 nucleotides was isolated from a normal
breast epithelial cell cDNA library and was sequenced. Expression studies indicate that
expression of the Protease M gene is downregulated in metastatic breast cancer cell lines
and is upregulated in primary breast cancer cell lines and ovarian cancer tissues and
tumor cell lines.

In one aspect, this invention pertains to isolated nucleic acid molecules
comprising a nucleotide sequence encoding a Protease M protein or a biologically active
portion thereof. In one embodiment, the invention features an isolated nucleic acid
molecule comprising the nucleotide sequence shown in SEQ ID NO: 1.

In another embodiment, an isolated nucleic acid molecule of the
present invention is at least 15 nucleotides in length and hybridizes under stringent
conditions to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID
NO: 1.

In yet another embodiment, a nucleic acid molecule of the present invention
comprises the coding region of the nucleotide sequence of SEQ ID NO: 1.

In still another embodiment, the invention provides isolated nucleic acid
molecules which encode proteins containing amino acid sequences which are
homologous to the sequence shown in SEQ ID NO: 2. For example, in one embodiment,
the protein comprises an amino acid sequence at least 60 % homologous to the amino acid
sequence of SEQ ID NO: 2. In another embodiment, the protein is at least about 70 %,
preferably at least 80 %, or more preferably at least 90 % homologous to
the amino acid sequence of SEQ ID NO: 2. In a preferred embodiment, an isolated
nucleic acid molecule of the invention encodes the amino acid sequence of SEQ ID NO:
2.
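
The homology thresholds above can be illustrated with a minimal sketch. It assumes that "homologous" is read as percent identity over the aligned, non-gap positions of a pairwise alignment; the text does not fix the measure, and the function names and toy sequences below are hypothetical, not taken from SEQ ID NO: 2.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, non-gap columns that carry identical residues.

    Assumption: both inputs come from the same pairwise alignment and use
    '-' as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def homology_tier(percent: float) -> str:
    """Map a percentage onto the 60/70/80/90 % tiers named in the text."""
    for threshold in (90, 80, 70, 60):
        if percent >= threshold:
            return f"at least {threshold}%"
    return "below 60%"


if __name__ == "__main__":
    # Toy sequences (hypothetical): nine of ten aligned residues match.
    pid = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
    print(pid, homology_tier(pid))
```

A similarity-based measure (counting conservative substitutions as matches) would need a substitution table and is not shown here.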

In another embodiment, an isolated nucleic acid molecule encodes a Protease M
fusion protein.

In yet another embodiment, an isolated nucleic acid molecule of the invention is
antisense to the nucleic acid molecule of claim 1. In a preferred embodiment, an
isolated nucleic acid is antisense to a coding region of the coding strand of the
nucleotide sequence of SEQ ID NO: 1. In yet another embodiment, an isolated nucleic
acid molecule of the invention is antisense to a noncoding region of the nucleotide
sequence of SEQ ID NO: 1.

In one embodiment of the invention, an isolated nucleic acid molecule
which encodes a Protease M polypeptide is isolated using at least a portion of the
nucleotide sequence of SEQ ID NO: 1 as a probe or a primer.

Another aspect of the invention pertains to vectors, e.g., recombinant 
expression vectors, containing the nucleic acid molecules of the invention. Such vectors 
can encode a protein comprising the amino acid sequence of SEQ ID NO: 2. In one 
embodiment of the invention, a vector is provided which comprises the coding region of 
the nucleotide sequence of SEQ ID NO: 1. In one embodiment, such a host cell is used
to produce Protease M protein by culturing the host cell in a suitable medium. If
desired, Protease M protein can then be isolated from the medium or the host cell.
Still another aspect of the invention pertains to isolated Protease M 
acid shown in SEQ ID NO: 1. In preferred embodiments, the Protease M protein is a
mature polypeptide which comprises amino acids 17-244 of SEQ ID NO: 2 or amino
acids 22-244. In other embodiments, the isolated Protease M protein comprises an amino
acid sequence at least 60 % homologous to the amino acid sequence of SEQ ID NO: 2
and possesses a Protease M bioactivity in vitro. Preferably, the protein is at least 70 %,
preferably at least 80 %, even more preferably at least 90 % homologous. In particularly
preferred embodiments a Protease M protein of the present invention is at least about 95 %
homologous to the amino acid sequence of SEQ ID NO: 2.

A Protease M protein of the invention can be incorporated into a
pharmaceutical composition comprising the protein and a pharmaceutically acceptable
carrier.

Moreover, the invention provides a fusion protein comprising a Protease 
M polypeptide operatively linked to a non-Protease M polypeptide. 

The Protease M proteins of the invention, or fragments thereof, can be
used to prepare anti-Protease M antibodies. The invention provides an antigenic peptide
of Protease M comprising at least 8 amino acid residues of the amino acid sequence
shown in SEQ ID NO: 2 and encompassing an epitope of Protease M such that an
antibody raised against the peptide forms a specific immune complex with Protease M.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, more
preferably at least 15 amino acid residues, even more preferably at least 20 amino acid
residues, and most preferably at least 30 amino acid residues. The invention further
provides an antibody that specifically binds Protease M. In one embodiment, the
antibody is monoclonal. In another embodiment, the antibody is coupled to a detectable
label. In yet another embodiment, the antibody is incorporated into a pharmaceutical
composition comprising the antibody and a pharmaceutically acceptable carrier.
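
As a rough illustration of the peptide lengths described above, the following sketch enumerates every contiguous peptide of a chosen length from a protein sequence. It only lists candidates; whether a given peptide actually encompasses an epitope is an experimental question the code does not address. The function name and the toy sequence are hypothetical and are not taken from SEQ ID NO: 2.

```python
def candidate_peptides(protein: str, length: int = 8) -> list[str]:
    """Return all contiguous peptides of the given length.

    The minimum of 8 residues mirrors the shortest antigenic peptide named
    in the text; longer preferred lengths (10, 15, 20, 30) can be passed in.
    """
    if length < 8:
        raise ValueError("the text describes peptides of at least 8 residues")
    # range() is empty when the protein is shorter than the window.
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


if __name__ == "__main__":
    toy_protein = "ABCDEFGHIJ"  # placeholder letters, not a real sequence
    for peptide in candidate_peptides(toy_protein, 8):
        print(peptide)
```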

Yet another aspect of the invention pertains to transgenic non-human animals in
which a Protease M gene has been introduced or altered. In one embodiment, the
genome of the non-human animal has been altered by introduction of a nucleic acid
molecule of the invention encoding Protease M as a transgene. In another embodiment,
an endogenous Protease M gene within the genome of the non-human animal has been
altered, e.g., functionally disrupted, by homologous recombination.

Another aspect of the invention pertains to methods for detecting the
presence or absence of Protease M in a biological sample. In a preferred embodiment,
the method involves contacting a biological sample (e.g., a tissue sample) with an agent
capable of detecting Protease M protein or nucleic acid such that the presence of
Protease M is detected in the biological sample. The agent can be, for example, a
labeled or labelable nucleic acid probe capable of hybridizing to Protease M mRNA or a
labeled or labelable antibody capable of binding to a Protease M protein. The invention
further provides methods for detecting carcinomas or for staging a carcinoma based on
detecting the presence, absence, or amount of Protease M protein or nucleic acid in a
test sample relative to a control sample. In one embodiment, the method involves
contacting a cell or other sample from a subject with an agent capable of detecting
Protease M protein or nucleic acid, determining the amount of Protease M protein or
nucleic acid expressed in the sample, comparing the amount of Protease M protein or nucleic
acid expressed in the sample to a control, and forming a diagnosis and/or prognosis
based on the amount of Protease M protein or nucleic acid expressed in the test sample
as compared to the control sample. Preferably, the sample is mammary or ovarian
tissue. For example, one such diagnostic method involves contacting the mRNA of a
test cell with a nucleic acid probe containing a sequence antisense to (i.e.,
complementary to the sense strand of) a segment of the nucleic acid sequence shown in
SEQ ID NO: 1. Kits for detecting Protease M in a biological sample are also within the
scope of the invention.
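
The notion of a probe "antisense to" a segment of the sense strand can be sketched as a reverse-complement operation. The example below is illustrative only: the function name is hypothetical and the input is a short toy DNA string, not the sequence of SEQ ID NO: 1.

```python
# Translation table for Watson-Crick complements of DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_probe(sense: str, start: int, length: int) -> str:
    """Return the reverse complement of sense[start:start + length].

    The result, read 5' to 3', is complementary to the chosen segment of
    the sense strand and would therefore pair with the corresponding mRNA.
    """
    segment = sense[start:start + length]
    if len(segment) < length:
        raise ValueError("requested segment runs past the end of the sequence")
    return segment.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    toy_sense = "ATGGCCAAG"  # placeholder sequence
    print(antisense_probe(toy_sense, 0, 6))
```

In practice a probe would be considerably longer than this toy segment; elsewhere the text speaks of nucleic acid molecules at least 15 nucleotides in length.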

The Protease M protein of the invention, and other agents related thereto,
can be used therapeutically. For example, the present invention can be used to modulate
the Protease M bioactivity associated with a cell (e.g., in the cell, secreted by the cell or
in the extracellular milieu surrounding the cell). Accordingly, in one embodiment, the
invention provides a method for modulating the Protease M serine protease activity
associated with a cell by contacting the cell with an agent that modulates Protease M
serine protease activity. Such an agent can be, for example, a Protease M protein
agonist or antagonist or a nucleic acid encoding a Protease M agonist or antagonist that
has been introduced into the cell. In one embodiment, Protease M activity is stimulated
in tumor cells, such as metastatic mammary tumor cells, in which endogenous Protease
M expression is low or absent. Alternatively, in another embodiment, the invention
provides a method for inhibiting the Protease M activity associated with a cell by
contacting the cell with an agent that inhibits Protease M serine protease activity. Such
an agent can be, for example, an antisense Protease M nucleic acid molecule, an
anti-Protease M antibody, or a Protease M antagonist or inhibitor. The methods of the
invention for modulating Protease M activity can be applied in vitro (e.g., to cells in
culture) or in vivo, wherein an agent that modulates Protease M serine protease activity
is administered to the subject. In a preferred embodiment, the invention provides a
method for inhibiting development or progression of cancer in a cell comprising
contacting a cell with an agent which modulates the amount of or activity of Protease M
in or around the tumor cell.
Drug screening methods for identifying modulators of Protease M
expression or Protease M serine protease activity are also encompassed by the invention.
In one embodiment, the modulator stimulates Protease M expression or activity, i.e., is
an agonist or potentiator. In another embodiment, the modulator inhibits Protease M
expression or activity, i.e., is an antagonist or inhibitor.

Other features and advantages of the invention will be apparent from the 
following detailed description, and from the claims. 



Brief Description of the Drawings

Figure 1 shows the identification of Protease M (1G3) by Differential Display (DD) gel
and northern blot. (A.) DD gel: 21PT and 21MT-1 RNA was reverse transcribed with
T12MG primer and PCR-amplified with T12MG and OPA1 primers in the presence of
35S-dATP, run on a 6% acrylamide sequencing gel, and exposed to x-ray film for 18
hours. The portion of the gel surrounding the differentially displayed 0.28 kb band is
shown. (B.) Northern Blot: 10 µg of total cell RNA was northern blotted and probed
with 32P-labeled PCR-amplified 0.28 kb band from the DD gel shown in (A).

Figure 2 shows Protease M cDNA. The cDNA sequence and putative protein coding
sequence of the longest clone from the 76N library are shown. The postulated pre-pro
N-terminal amino acids are underlined. The predicted cleavage sites of pre and pro amino
acids after ala16 and lys21, respectively, are indicated by arrows. The potential N-linked
glycosylation site at amino acids 134-136 and asp191 at the bottom of the binding cleft
are boxed. The residues of the catalytic triad (his62, asp106 and ser197) are circled.
The actual polyadenylation signal at nucleotide 1,490 and an alternative polyadenylation
signal at nucleotide 1,095 are underlined.

Figure 3 shows an alignment of Protease M with closely related members of the serine
protease family. The GCG pileup and pretty plot programs were used to align Protease
M with closely related human serine proteases. They are, from top to bottom: glandular
kallikrein-hk2 (accession number SP|P06870|), PSA-hk3 (accession number
SP|P07288|), pancreatic kallikrein-hk-1 (accession number SP|P20511|), and trypsinogen
1 (accession number SP|P07477|). Amino acids comprising the catalytic triad are marked
with an asterisk. The 29 "invariant" amino acids (Dayhoff) are marked with a dot or an
asterisk.
Figure 4 shows Protease M mRNA expression in mammary and prostate cell lines.
(A.) 10 µg of total mammary cell RNA was run on an agarose/formaldehyde gel,
blotted and hybridized to 32P-labeled Protease M probe and exposed to x-ray film for 20
hours. (B.) 10 µg of total prostate cell RNA was blotted and hybridized (as in A) and
exposed to x-ray film for 20 hours.

Figure 5 shows Protease M mRNA expression in ovarian tissue.
10 µg of total cell RNA isolated from ovarian tissue was blotted and hybridized to
Protease M probe (as in Figure 4) and exposed to x-ray film for 5 days.

Figure 6 shows Protease M mRNA expression in human tissue.
A northern blot containing 2 µg of polyA+ RNA from normal human tissue (Clontech)
was hybridized to Protease M probes (as in Figure 4). The blot was exposed to x-ray
film for 2 days.

Figure 7 shows the expression of Protease M protein in mammary cell lines and insect
cells infected with recombinant Protease M. 50 µg of total cell lysate from mammary
cell lines, uninfected insect cells (SF9) or insect cells infected with 4.5 ml recombinant
Protease M baculovirus (SF9/1G3(1)) or 22.5 ml recombinant baculovirus (SF9/Protease
M(2)) was run on a 12% polyacrylamide/SDS gel, transferred to a PVDF membrane, and
reacted with Protease M polyclonal anti-peptide antibody as the primary antibody and
horseradish peroxidase conjugated anti-rabbit IgG secondary antibody. Bands were
detected with the ECL detection system.

Detailed Description of the Invention

Protease M was isolated by differential display (Liang L and Pardee AB.
(1992) Science 257:967-970; Liang L, et al. (1993) Nucleic Acids Res. 21:3267-3275;
Sager R, et al. (1993) FASEB J. 7:964-970). Protease M is a novel member of the
serine protease family which is most homologous to trypsin and members of the
kallikrein family. Protease M is downregulated in metastatic breast cancer lines, but
strongly expressed at the mRNA level in some primary breast cancer cell lines and in
ovarian cancer tissues and tumor cell lines.

Protease M was originally identified as being differentially expressed in a
primary ductal breast carcinoma (21PT) as compared to a pleural metastasis (21MT-1)
derived from the same patient. A full-length cDNA was subsequently isolated using the
partial cDNA as a hybridization probe to screen a cDNA library prepared from a normal
breast epithelial cell (76N).

The nucleotide sequence of the isolated human Protease M cDNA, and
the predicted amino acid sequence of the human Protease M protein, are shown in SEQ
ID NOs: 1 and 2, respectively. The full-length cDNA clone isolated is 1526 nucleotides
in length and comprises 245 base pairs of 5' nontranslated sequence, 732 base pairs of
coding sequence, and 549 base pairs of 3' nontranslated sequence. The predicted
Protease M protein is 244 amino acids. The NH2 terminus comprises 13 consecutive
hydrophobic amino acids (leu4-ala16), which is a predicted signal sequence. The
residues glu17-glu18-glu19-asn20-lys21 resemble a pro-polypeptide with a potential
trypsin cleavage site after lys21.
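
The segment lengths quoted for the cDNA can be cross-checked with simple arithmetic. The sketch below assumes a 5' nontranslated region of 245 base pairs, which is the value that makes the three segments sum to the stated 1,526 nucleotides, and assumes that the 732 base pairs of coding sequence correspond to the 244 codons of the predicted protein. The function and constant names are hypothetical.

```python
FIVE_PRIME_UTR_BP = 245   # assumption: value consistent with the stated total
CODING_BP = 732
THREE_PRIME_UTR_BP = 549
TOTAL_BP = 1526
PREDICTED_RESIDUES = 244


def layout_is_consistent(five: int, coding: int, three: int,
                         total: int, residues: int) -> bool:
    """True when the segments sum to the clone length and the coding
    region is exactly three nucleotides per predicted residue."""
    return five + coding + three == total and coding == 3 * residues


if __name__ == "__main__":
    print(layout_is_consistent(FIVE_PRIME_UTR_BP, CODING_BP,
                               THREE_PRIME_UTR_BP, TOTAL_BP,
                               PREDICTED_RESIDUES))
```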

Comparison of Protease M with other known proteins showed that
glandular kallikrein 2 (Schedlich LJ, et al. (1987) DNA 6:429-437; Riegman PH, et al.
(1991) Mol. Cell Endocrinol. 76:181-190) has 44% exact matches and 48% match with
conservative changes. Trypsin I (Emi M, et al. (1986) Gene 41:305-310) has 43% exact
matches and 49% match with conservative changes. Both glandular kallikrein 1
(Fukushima D, et al. (1985) Biochemistry 24:8037-8043; Baker A, Shine J. (1985) DNA
4:445-450; Takahashi S, Irie A, Miyake Y. (1988) Biochem. 404:22-29; Lu HS, et al.
(1989) Int. J. Peptide Protein Res. 33:237-249; Angermann A, et al. (1989) Biochem.
J. 262:787-793) and prostate specific antigen (Watt, et al. (1986) Proc. Natl. Acad. Sci.
USA 83:3166-3170; Lundwall A, Lilja H. (1987) FEBS Letters 214:317-322; Schaller J,
et al. (1987) Eur. J. Biochem. 170:111-120; Riegman PHJ, Klaassen P, et al. (1988)
Biochem. and Biophys. Res. Comm. 155:181-188; Henttu P and Vihko P.
(1989) Biochem. and Biophys. Res. Comm. 60:903-910) have 39% exact matches and
44% match with conservative changes.

Structural features important for serine protease activity, such as the
catalytic triad (his62, asp106, ser197), cysteine bridges (Cys28-Cys157, Cys^^-Cys^^,
Cys138-Cys203, Cys168-Cys182, and Cys193-Cys218), and residues lining the binding
cleft, are almost perfectly conserved between Protease M and other members of the
kallikrein family. The Asp residue at position 191 predicts that Protease M has a
trypsin-like cleavage pattern. Unlike the members of the kallikrein family, Protease M
and trypsin lack the kallikrein loop at amino acid residues 109-119, which is important
for kallikrein specificity.

Moreover, Protease M mRNA has a distinct expression pattern that
distinguishes it from other serine proteases. A 1.7-1.8 kb message was found in
normal brain, kidney, and pancreas tissue, but not in heart, placenta, lung, liver, or
skeletal muscle. The message detected in the pancreas was only about 1.2 kb.
Expression studies further indicate that expression of the Protease M gene is
downregulated in metastatic breast cancer cell lines and is upregulated in primary breast
cancer cell lines and ovarian cancer tissues and tumor cell lines.

The Protease M gene was localized by FISH analysis to chromosome
19q13.4. The three kallikrein genes also map to chromosome 19q13.2-q13.4, while
trypsinogen 1 maps to chromosome 7. These mapping data suggest that Protease M is
probably more closely related on an evolutionary basis to the kallikreins than to trypsin.

The size of the detected Protease M protein is approximately 36 kD rather
than the predicted size of 27 kD. This size discrepancy could be accounted for by
glycosylation at asn134. The expression of Protease M is regulated at both the
transcriptional and translational levels.

Accordingly, certain aspects of the present invention relate to nucleic
acids encoding Protease M proteins, the Protease M proteins themselves, antibodies
immunoreactive with Protease M proteins, and preparations of such compositions.
Moreover, the present invention provides diagnostic/prognostic assays and therapeutic
reagents for detecting and treating disorders involving, for example, aberrant expression
of Protease M or Protease M homologs. In addition, drug discovery assays are provided
for identifying agents which can modulate the biological function of Protease M
proteins, such as by altering the binding of Protease M molecules to proteins, including
substrates. Such agents can be useful therapeutically to alter the growth and/or
differentiation of a cell. Other aspects of the invention are described below or will be
apparent to those skilled in the art in light of the present disclosure.

Various aspects of the invention are described in further detail in the 
following subsections: 

I. Definitions

In general, polypeptides referred to herein as having an activity of a
Protease M protein (e.g., are "bioactive") are defined as polypeptides which include an
amino acid sequence corresponding (e.g., identical or homologous) to all or a portion of
the amino acid sequences of a Protease M protein shown in SEQ ID NO: 2 and which
mimic or antagonize all or a portion of the biological/biochemical activities of a
naturally occurring Protease M protein. Examples of such biological activity include
serine protease activity and/or the ability to compete with a bioactivity of a naturally
occurring Protease M. The ability of portions of Protease M to exhibit serine protease
activity can be determined in standard in vitro serine protease assays, for example as
described in detail in the appended examples. In other embodiments, a Protease M
molecule of the present invention is capable of modulating the proliferation or
metastasis of a cell, either in vitro or in vivo.

Other biological activities of the subject Protease M proteins are
described herein or will be reasonably apparent to the skilled artisan. According to the
present invention, a polypeptide has biological activity if it is a specific agonist or
antagonist of a naturally-occurring form of a Protease M protein.

"Cells," "host cells" or "recombinant host cells" are terms used 
interchangeably herein. It is understood that such terms refer not only to the particular 
subject cell but to the progeny or potential progeny of such a cell. Because certain 

modifications may occur in succeeding generations due to either mutation or
environmental influences, such progeny may not, in fact, be identical to the parent cell,
but are still included within the scope of the term as used herein. 

A "chimeric protein" or "fusion protein" is a fusion of a first amino acid 
sequence encoding one of the subject Protease M polypeptides with a second amino acid
sequence defining a domain (e.g., polypeptide portion) foreign to and not substantially
homologous with any domain of one of the Protease M proteins. A chimeric protein may
present a foreign domain which is found (albeit in a different protein) in an organism
which also expresses the first protein, or it may be an "interspecies", "intergenic", etc.
fusion of protein structures expressed by different kinds of organisms. In general, a
fusion protein can be represented by the general formula X-Protease M-Y, wherein
Protease M represents a portion of the protein which is derived from a Protease M
protein, and X and Y are, independently, absent or represent amino acid sequences
which are not related to a Protease M sequence in an organism.

As is well known, genes for a particular polypeptide may exist in single
or multiple copies within the genome of an individual. Such duplicate genes may be
identical or may have certain modifications, including nucleotide substitutions, additions
or deletions, which all still code for polypeptides having substantially the same activity.
The term "DNA sequence encoding a Protease M polypeptide" may thus refer to one or
more genes within a particular individual. Moreover, certain differences in nucleotide
sequences may exist between individuals of the same species, which are called alleles.
Such allelic differences may or may not result in differences in amino acid sequence of
the encoded polypeptide yet still encode a protein with the same biological activity.

As used herein, the term "gene" or "recombinant gene" refers to a nucleic
acid comprising an open reading frame encoding a Protease M polypeptide of the
present invention, including both exon and (optionally) intron sequences. A
"recombinant gene" refers to nucleic acid encoding a Protease M polypeptide and
comprising Protease M-encoding exon sequences, though it may optionally include 
molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA). Nucleic
acids may be double stranded or single stranded, and the term is meant to include a
nucleic acid which is complementary to (i.e., can specifically hybridize to) a nucleic acid of
the present invention (e.g., an antisense molecule). The term "nucleic acid" as used
herein is intended to include fragments as equivalents. The term "equivalent" is
understood to include nucleotide sequences encoding functionally equivalent Protease M
polypeptides or functionally equivalent peptides having a bioactivity of a Protease M
protein such as described herein. Equivalent nucleotide sequences will include
sequences that differ by one or more nucleotide substitutions, additions or deletions,
such as allelic variants; and will, therefore, include sequences that differ from the
nucleotide sequence of the Protease M cDNA sequence shown in SEQ ID NO: 1 due to
the degeneracy of the genetic code. Equivalents will also include nucleotide sequences
that hybridize under stringent conditions (i.e., equivalent to about 20-27°C below the
melting temperature (Tm) of the DNA duplex formed in about 1 M salt) to the nucleotide
sequence represented in SEQ ID NO: 1. In one embodiment, equivalents will further
include nucleic acid sequences derived from, and evolutionarily related to, a nucleotide
sequence shown in SEQ ID NO: 1.
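
By way of illustration only, the stringency window recited above (about 20-27°C below the melting temperature of the duplex) can be expressed as simple arithmetic. The sketch below assumes the melting temperature has already been determined by some other means; the function name and the example value of 85°C are hypothetical and are not taken from the specification.

```python
def stringent_hybridization_window(tm_celsius):
    """Return the (low, high) temperature range, in degrees C, lying
    27 to 20 degrees below the supplied duplex melting temperature."""
    return (tm_celsius - 27.0, tm_celsius - 20.0)

# Hypothetical duplex melting temperature, used only to show the arithmetic.
example_tm = 85.0
low, high = stringent_hybridization_window(example_tm)
print(f"Hybridize at about {low:.0f}-{high:.0f} C for a Tm of {example_tm:.0f} C")
```

The melting temperature itself depends on duplex length, base composition and salt concentration, none of which is modeled in this sketch.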

As used herein, the term "specifically hybridizes" refers to the ability of
the probe/primer of the invention to hybridize to at least 15 consecutive nucleotides of a
Protease M gene, such as a Protease M sequence designated in SEQ ID NO: 1, or a
sequence complementary thereto, or naturally occurring mutants thereof, such that it has
less than 15%, preferably less than 10%, and more preferably less than 5% background
hybridization to a cellular nucleic acid (e.g., mRNA or genomic DNA) encoding a
protein other than a Protease M protein, as defined herein.

As used herein, the term "tissue-specific promoter" means a DNA
sequence that serves as a promoter, i.e., regulates expression of a selected DNA
sequence operably linked to the promoter, and which effects expression of the selected
DNA sequence in specific cells of a tissue, such as cells of hepatic, pancreatic, neuronal
or hematopoietic origin. The term also covers so-called "leaky" promoters, which
regulate expression of a selected DNA primarily in one tissue, but can cause at least low
level expression in other tissues as well.

As used herein, the term "transfection" means the introduction of a
nucleic acid, e.g., an expression vector, into a recipient cell by nucleic acid-mediated
gene transfer. "Transformation", as used herein, refers to a process in which a cell's
genotype is changed as a result of the cellular uptake of exogenous DNA or RNA, and,
for example, the transformed cell expresses a recombinant form of a Protease M
polypeptide or, where antisense expression occurs from the transferred gene, the
expression of a naturally-occurring form of the Protease M protein is disrupted.

As used herein, a "transgenic animal" is any animal, preferably a non-human
mammal, bird or an amphibian, in which one or more of the cells of the animal
contain heterologous nucleic acid introduced by way of human intervention, such as by
transgenic techniques well known in the art. The nucleic acid is introduced into the cell,
directly or indirectly by introduction into a precursor of the cell, by way of deliberate
genetic manipulation, such as by microinjection or by infection with a recombinant
virus. The term genetic manipulation does not include classical cross-breeding, or in
vitro fertilization, but rather is directed to the introduction of a recombinant DNA
molecule. This molecule may be integrated within a chromosome, or it may be
extrachromosomally replicating DNA. In the typical transgenic animals described
herein, the transgene causes cells to express a recombinant form of a Protease M protein.
However, transgenic animals in which the recombinant Protease M gene is silent are
also contemplated, as for example, the FLP or CRE recombinase dependent constructs
described below. Moreover, "transgenic animal" also includes those recombinant
animals in which gene disruption of one or more Protease M genes is caused by human
intervention, including both recombination and antisense techniques.

"Transcriptional regulatory sequence" is a generic term used throughout 
the specification to refer to DNA sequences, such as initiation signals, enhancers, and
promoters, which induce or control transcription of protein coding sequences with which
they are operably linked. In preferred embodiments, transcription of a recombinant
Protease M gene is under the control of a promoter sequence (or other transcriptional
regulatory sequence) which controls the expression of the recombinant gene in a cell
type in which expression is intended. It will also be understood that the recombinant
gene can be under the control of transcriptional regulatory sequences which are the same
as or different from those sequences which control transcription of naturally-occurring
forms of Protease M genes.

As used herein, the term "transgene" means a nucleic acid sequence
(encoding a Protease M polypeptide, or an antisense transcript thereto), which is partly
or entirely heterologous, i.e., foreign, to the transgenic animal or cell into which it is
introduced, or, is homologous to an endogenous gene of the transgenic animal or cell
into which it is introduced, but which is designed to be inserted, or is inserted, into the
animal's genome in such a way as to alter the genome of the cell into which it is inserted
(e.g., it is inserted at a location which differs from that of the natural gene or its insertion
results in a knockout). A transgene can include one or more transcriptional regulatory
sequences and any other nucleic acid, such as introns, that may be necessary for optimal
expression of a selected nucleic acid.

As used herein, the term "vector" refers to a nucleic acid molecule
capable of transporting another nucleic acid to which it has been linked. One type of
preferred vector is an episome, i.e., a nucleic acid capable of extra-chromosomal
replication. Preferred vectors are those capable of autonomous replication
and/or expression of nucleic acids to which they are linked. Vectors capable of directing
the expression of genes to which they are operatively linked are referred to herein as
"expression vectors". In general, expression vectors of utility in recombinant DNA
techniques are often in the form of "plasmids", which refer generally to circular double
stranded DNA loops which, in their vector form, are not bound to the chromosome. In
the present specification, "plasmid" and "vector" are used interchangeably as the plasmid
is the most commonly used form of vector. However, the invention is intended to
include such other forms of expression vectors which serve equivalent functions and
which become known in the art subsequently hereto.

II. Isolated Nucleic Acid Molecules

One aspect of the invention pertains to isolated nucleic acid molecules
that encode Protease M or biologically active portions thereof, as well as nucleic acid
fragments sufficient for use as hybridization probes to identify Protease M-encoding
nucleic acid.

A Protease M nucleic acid, or a portion thereof, can be isolated using
standard molecular biology techniques and the sequence information provided herein.
For example, a human Protease M cDNA can be isolated from a cell line (e.g., a normal
mammary epithelial cell line) or from a cDNA library, using all or a portion of SEQ ID
NO: 1 as a hybridization probe and standard hybridization techniques (e.g., as described
in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory
Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1989).
Moreover, a nucleic acid molecule encompassing all or a portion of SEQ ID NO: 1 can
be isolated by the polymerase chain reaction using oligonucleotide primers designed
based upon the sequence of SEQ ID NO: 1. For example, mRNA can be isolated from
normal mammary epithelial cells (e.g., by the guanidinium-thiocyanate extraction
procedure of Chirgwin et al. (1979) Biochemistry 18: 5294-5299) and cDNA can be
prepared using reverse transcriptase (e.g., Moloney MLV reverse transcriptase, available
from Gibco/BRL, Bethesda, MD; or AMV reverse transcriptase, available from
Seikagaku America, Inc., St. Petersburg, FL). Synthetic oligonucleotide primers for
ID NO: 1. For example, primers suitable for amplification of a Protease M nucleic acid
are provided in the appended Examples. A nucleic acid of the invention can be
amplified using cDNA or, alternatively, genomic DNA, as a template and appropriate
oligonucleotide primers according to standard PCR amplification techniques. The
nucleic acid so amplified can be cloned into an appropriate vector and characterized by
DNA sequence analysis. Furthermore, oligonucleotides corresponding to a Protease M
nucleotide sequence can be prepared by standard synthetic techniques, e.g., using an
automated DNA synthesizer.

In one embodiment, an isolated nucleic acid molecule of the invention
comprises the nucleotide sequence shown in SEQ ID NO: 1 or a fragment thereof. The
sequence of SEQ ID NO: 1 corresponds to the human Protease M cDNA. This cDNA
comprises sequences encoding the Protease M protein (i.e., "the coding region", from
nucleotides 246 to 977), as well as 5' untranslated sequences (nucleotides 1 to 245) and
3' untranslated sequences (nucleotides 978 to 1526). Alternatively, the nucleic acid
molecule may comprise only the coding region of SEQ ID NO: 1 (e.g., nucleotides 246
to 977), for example a fragment encoding a biologically active portion of Protease M.
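
The nucleotide coordinates recited above can be checked, and the corresponding regions extracted, with a short sketch such as the following. It is offered only as an illustration: the dictionary keys, function names and the placeholder sequence are hypothetical, and the actual sequence of SEQ ID NO: 1 is not reproduced here.

```python
# Coordinates are 1-based and inclusive, as quoted in the text for
# SEQ ID NO: 1 (1526 nucleotides in total).
REGIONS = {
    "5_prime_utr": (1, 245),
    "coding_region": (246, 977),
    "3_prime_utr": (978, 1526),
}

def region_length(name):
    """Length in nucleotides of a named region."""
    start, end = REGIONS[name]
    return end - start + 1

def extract_region(sequence, name):
    """Slice a named region out of a full-length cDNA string."""
    start, end = REGIONS[name]
    if len(sequence) < end:
        raise ValueError("sequence is shorter than the requested region")
    # Convert the 1-based inclusive start to a 0-based slice index.
    return sequence[start - 1:end]

# Placeholder only: a string of N's standing in for the real cDNA.
placeholder_cdna = "N" * 1526
coding = extract_region(placeholder_cdna, "coding_region")
coding_length = region_length("coding_region")
codon_count = coding_length // 3
print(coding_length, codon_count)
```

On these coordinates the coding region spans 732 nucleotides, i.e., 244 codons, which agrees with the residue numbering (up to residue 244) used for the polypeptide elsewhere in this section.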

In another embodiment, the Protease M nucleic acid of the present
invention encodes the polypeptide shown in SEQ ID NO: 2. In another embodiment, a
Protease M nucleic acid encodes a biologically active portion of Protease M. In yet
another embodiment, a Protease M nucleic acid encodes a mature form of Protease M in
which a hydrophobic, amino-terminal signal sequence (encompassing approximately
amino acids 1-16) is absent. In a further embodiment, a mature form of Protease M
preferably comprises about amino acid residues 22 to 244 (i.e., Protease M which has
been cleaved at a trypsin site). Although, in preferred embodiments, a nucleic acid of
the present invention encodes a protein in which amino acid residue 22 is the N-terminal
residue of the mature protein, more than one native isoform differing in the length of the
N-terminal sequence may exist for Protease M. Consequently, the skilled artisan will
appreciate that some flexibility exists in the N-terminus of the mature form of Protease
M lacking a signal sequence. Additional nucleic acid fragments encoding biologically
active portions of Protease M can be prepared by isolating a portion of SEQ ID NO: 1,
expressing the encoded portion of Protease M protein or peptide (e.g., by recombinant
expression in vitro) and assessing the bioactivity of the encoded portion of Protease M
protein or peptide.

In another embodiment, an isolated nucleic acid molecule of the
invention is at least 15 nucleotides in length and hybridizes under stringent conditions to
the nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO: 1. In
other embodiments, the nucleic acid is at least 30, 50, 100, 250 or 500 nucleotides in
length. As used herein, the term "hybridizes under stringent conditions" is intended to
describe conditions for hybridization and washing under which nucleotide sequences at
least 60% homologous to each other typically remain hybridized to each other.
Preferably, the conditions are such that sequences at least 65%, more preferably
at least 70%, and even more preferably at least 75% homologous to each other typically
remain hybridized to each other. Such stringent conditions are known to those skilled in
the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons,
N.Y. (1989), 6.3.1-6.3.6. A preferred, non-limiting example of stringent hybridization
conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C,
followed by one or more washes in 0.2X SSC, 0.1% SDS at 50-65°C. Preferably, an
isolated nucleic acid molecule of the invention that hybridizes under stringent conditions
to the sequence of SEQ ID NO: 1 corresponds to a naturally-occurring nucleic acid
molecule. As used herein, a "naturally-occurring" nucleic acid molecule refers to an
RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes
a natural protein). In one embodiment, the nucleic acid encodes a natural human
Protease M. In another embodiment, the nucleic acid molecule encodes a murine
homologue of human Protease M.

In one embodiment, a Protease M nucleic acid of the present invention
comprises the sequence shown in SEQ ID NO: 1 or a fragment thereof. It will be
appreciated by those skilled in the art that DNA sequence polymorphisms that lead to
changes in the amino acid sequences of Protease M may exist within a population (e.g.,
the human population). Such genetic polymorphism in the Protease M gene may exist
among individuals within a population due to natural allelic variation. Such natural
allelic variations can typically result in 1-5% variance in the nucleotide sequence of a
gene. Any and all such nucleotide variations and resulting amino acid polymorphisms in
Protease M that are the result of natural allelic variation and that do not alter the
functional activity of Protease M are within the scope of the invention. Moreover,
nucleic acid molecules encoding Protease M proteins from other species, and thus which
have a nucleotide sequence which differs from the human sequence of SEQ ID NO: 1,
are intended to be within the scope of the invention. Nucleic acid molecules
corresponding to natural allelic variants and nonhuman homologues of the human
Protease M cDNA of the invention can be isolated based on their homology to the
human Protease M nucleic acid disclosed herein using the human cDNA, or a portion
thereof, as a hybridization probe according to standard hybridization techniques under
stringent hybridization conditions.

In addition to naturally-occurring allelic variants of the Protease M
changes may be introduced by mutation into the nucleotide sequence of SEQ ID NO: 1,
thereby leading to changes in the amino acid sequence of the encoded Protease M
protein, without altering the functional ability of the Protease M protein, as described in
more detail below. Accordingly, another aspect of the invention pertains to nucleic acid
molecules encoding Protease M proteins that contain changes in amino acid residues that
are not essential for Protease M activity, e.g., residues that are not conserved or only
semi-conserved among members of the chymotrypsin family of serine proteases. Such
Protease M proteins differ in amino acid sequence from SEQ ID NO: 2 yet retain
Protease M bioactivity.

The invention further encompasses nucleic acid molecules that differ
from SEQ ID NO: 1 (and portions thereof) due to degeneracy of the genetic code. In
another embodiment, the isolated nucleic acid molecule comprises a nucleotide sequence
encoding a protein, wherein the protein comprises an amino acid sequence at least 60%
homologous to the amino acid sequence of SEQ ID NO: 2 and exhibits serine protease
activity in vitro. Preferably, the protein encoded by the nucleic acid molecule is at least
70% homologous to SEQ ID NO: 2, more preferably at least 80% homologous to SEQ
ID NO: 2, even more preferably at least 90% homologous to SEQ ID NO: 2. In a
particularly preferred embodiment, a Protease M nucleic acid of the present invention is
at least about 95% homologous to SEQ ID NO: 2.
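
The degeneracy of the genetic code referred to above can be illustrated with a short sketch in which two different nucleotide sequences are translated with the standard genetic code and yield the same peptide. The two 12-nucleotide sequences are hypothetical examples chosen for illustration and are not taken from SEQ ID NO: 1.

```python
from itertools import product

# Standard genetic code, listed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a DNA coding strand, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# Hypothetical sequences that differ at three of four codons.
seq_a = "ATGAAACTGAGC"
seq_b = "ATGAAGTTATCT"
print(translate(seq_a), translate(seq_b))
```

Both sequences encode the peptide Met-Lys-Leu-Ser although they differ at the nucleotide level, which is the sense in which sequences differing only by degeneracy encode the same protein.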

To determine the percent homology of two amino acid sequences (e.g.,
SEQ ID NO: 2 and a mutant form thereof), the sequences are aligned for optimal
comparison purposes (e.g., gaps may be introduced in the sequence of one protein for
optimal alignment with the other protein). The amino acid residues at corresponding
amino acid positions are then compared. When a position in one sequence (e.g., SEQ ID
NO: 2) is occupied by the same amino acid residue as the corresponding position in the
other sequence (e.g., a mutant form of Protease M), then the molecules are homologous
at that position (i.e., as used herein amino acid "homology" is equivalent to amino acid
"identity"). The percent homology between the two sequences is a function of the
number of identical positions shared by the sequences (i.e., % homology = # of identical
positions/total # of positions x 100).
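
The percent homology calculation defined above can be sketched as follows. The sketch assumes that the two sequences have already been aligned (so that they are of equal length, with "-" marking any introduced gap) and that the "total # of positions" is the length of the alignment; the latter is an interpretive assumption, as the text does not state how gapped positions are counted. The example peptides are hypothetical.

```python
def percent_homology(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: the denominator is the full alignment length, and a gap
    character never counts as an identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)

# Hypothetical four-residue peptides differing at one position.
print(percent_homology("MKKL", "MKRL"))
```

A sequence at least 60% homologous to a reference under this definition shares identical residues at no fewer than 60 of every 100 aligned positions.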

An isolated nucleic acid molecule encoding a Protease M protein
homologous to the protein of SEQ ID NO: 2 can be created by introducing one or more
nucleotide substitutions, additions or deletions into the nucleotide sequence of SEQ ID
NO: 1 such that one or more amino acid substitutions, additions or deletions are
introduced into the encoded protein, as detailed below.

In addition to the nucleic acid molecules encoding Protease M proteins
described above, another aspect of the invention pertains to isolated nucleic acid
molecules which are antisense thereto. An "antisense" nucleic acid comprises a
nucleotide sequence which is complementary to a "sense" nucleic acid encoding a
protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule
or complementary to an mRNA sequence. Accordingly, an antisense nucleic acid can
hydrogen bond to a sense nucleic acid.

The antisense nucleic acid can be complementary to an entire Protease M
coding strand, or to only a portion thereof. In one embodiment, an antisense nucleic 
acid molecule is antisense to a "coding region" of the coding strand of a nucleotide 
sequence encoding Protease M. The term "coding region" refers to the region of the 
nucleotide sequence comprising codons which are translated into amino acid residues 
(e.g., the entire coding region of SEQ ID NO: 1 comprises nucleotides 246 to 977). In 
another embodiment, the antisense nucleic acid molecule is antisense to a "noncoding 
region" of the coding strand of a nucleotide sequence encoding Protease M. The term 
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are 
not translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions). 

Given the coding strand sequences encoding Protease M disclosed herein
(e.g., SEQ ID NO: 1), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick base pairing. Preferably, the antisense nucleic acid is an
oligonucleotide which is antisense to only a portion of the coding or noncoding region of
Protease M mRNA. For example, the antisense oligonucleotide may be complementary to
the region surrounding the translation start site of Protease M mRNA. An antisense
oligonucleotide can be, for example, about 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in
length. An antisense nucleic acid of the invention can be constructed using chemical
synthesis and enzymatic ligation reactions using procedures known in the art. For example,
an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized
using naturally occurring nucleotides or variously modified nucleotides designed to increase
the biological stability of the molecules or to increase the physical stability of the duplex
formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives
and acridine substituted nucleotides can be used. Alternatively, the antisense nucleic
acid can be produced biologically using an expression vector into which a nucleic acid
has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted
nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).
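
Designing an antisense oligonucleotide by Watson and Crick base pairing, as described above, amounts to taking the reverse complement of the chosen stretch of the coding strand. The following is a minimal sketch under that assumption; the function name and the 15-nucleotide example sequence are hypothetical and are not taken from SEQ ID NO: 1, and the sketch does not model the modified nucleotides mentioned above.

```python
# Watson-Crick complements for DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def antisense_oligo(coding_strand, start, length):
    """Return the DNA oligonucleotide antisense to `length` nucleotides of
    the coding strand beginning at 1-based position `start`, written 5' to 3'."""
    target = coding_strand[start - 1:start - 1 + length]
    if start < 1 or len(target) != length:
        raise ValueError("requested window falls outside the sequence")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return target.translate(COMPLEMENT)[::-1]

# Hypothetical 15-nucleotide stretch spanning an ATG start codon.
sense = "GCCACCATGGCGAAA"
oligo = antisense_oligo(sense, 1, 15)
print(oligo)
```

An oligonucleotide produced in this way is fully complementary to the selected window and can therefore hydrogen bond to the corresponding sense sequence.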

In another embodiment, an antisense nucleic acid of the invention is a
ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity which are
capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they
nucleic acid can be designed based upon the nucleotide sequence of a Protease M cDNA
disclosed herein (i.e., SEQ ID NO: 1). For example, a derivative of a Tetrahymena L-19
IVS RNA can be constructed in which the base sequence of the active site is
complementary to the base sequence to be cleaved in a Protease M-encoding mRNA.
See for example Cech et al. U.S. Patent No. 4,987,071; and Cech et al. U.S. Patent No.
5,116,742. Alternatively, Protease M mRNA can be used to select a catalytic RNA
having a specific ribonuclease activity from a pool of RNA molecules. See for example
Bartel, D. and Szostak, J.W. (1993) Science 261:1411-1418.

III. Recombinant Expression Vectors and Host Cells

The recombinant expression vectors of the invention comprise a nucleic
acid of the invention in a form suitable for expression of the nucleic acid in a host cell,
which means that the recombinant expression vectors include one or more regulatory
sequences, selected on the basis of the host cells to be used for expression, which are
operatively linked to the nucleic acid sequence to be expressed. Within a recombinant
expression vector, "operably linked" is intended to mean that the nucleotide sequence of
interest is linked to the regulatory sequence(s) in a manner which allows for expression
of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a
host cell when the vector is introduced into the host cell). The term "regulatory
sequence" is intended to include promoters, enhancers and other expression control
elements (e.g., polyadenylation signals). Such regulatory sequences are described, for
example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185,
Academic Press, San Diego, CA (1990). Regulatory sequences include those which
direct constitutive expression of a nucleotide sequence in many types of host cell and
those which direct expression of the nucleotide sequence only in certain host cells (e.g.,
tissue-specific regulatory sequences). It will be appreciated by those skilled in the art
that the design of the expression vector may depend on such factors as the choice of the
host cell to be transformed, the level of expression of protein desired, etc. The
expression vectors of the invention can be introduced into host cells to thereby produce
proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as
described herein (e.g., Protease M proteins, mutant forms of Protease M, fusion proteins,
etc.).

The recombinant expression vectors of the invention can be designed for
expression of Protease M in prokaryotic or eukaryotic cells. For example, Protease M
can be expressed in bacterial cells such as E. coli or insect cells (using baculovirus
expression vectors) as described in detail in the appended Examples. Other possible
host cells include yeast cells or mammalian cells. Suitable host cells are discussed
further in Goeddel, Gene Expression Technology: Methods in Enzymology 185,
Academic Press, San Diego, CA (1990). Alternatively, the recombinant expression
vector may be transcribed and translated in vitro, for example using T7 promoter
regulatory sequences and T7 polymerase.

Expression of proteins in prokaryotes is most often carried out in E. coli
with vectors containing constitutive or inducible promoters directing the expression of
either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a
protein encoded therein, usually to the amino terminus of the recombinant protein. Such
fusion vectors typically serve three purposes: 1) to increase expression of recombinant
protein; 2) to increase the solubility of the recombinant protein; and 3) to aid in the
purification of the recombinant protein by acting as a ligand in affinity purification.
Often, in fusion expression vectors, a proteolytic cleavage site is introduced at the
junction of the fusion moiety and the recombinant protein to enable separation of the
recombinant protein from the fusion moiety subsequent to purification of the fusion
protein. Such enzymes, and their cognate recognition sequences, include Factor Xa,
thrombin and enterokinase. Typical fusion expression vectors include pGEX
(Pharmacia Biotech Inc; Smith, D.B. and Johnson, K.S. (1988) Gene 67:31-40), pMAL
(New England Biolabs, Beverly, MA) and pRIT5 (Pharmacia, Piscataway, NJ) which
fuse glutathione S-transferase (GST), maltose E binding protein, or protein A,
respectively, to the target recombinant protein. In a preferred embodiment, exemplified
herein, the coding sequence of the mature form of Protease M (i.e., encompassing amino
acids 22-244) is cloned into a pGEX-2T expression vector to create a vector encoding a
fusion protein which was solubilized from bacteria and purified on glutathione agarose
beads by standard methods (Smith, D.B. and Johnson, K.S. (1988) Gene 67:31).

Examples of suitable inducible non-fusion E. coli expression vectors
include pTrc (Amann et al., (1988) Gene 69:301-315) and pET 11d (Studier et al., Gene
Expression Technology: Methods in Enzymology 185, Academic Press, San Diego,
California (1990) 60-89). Target gene expression from the pTrc vector relies on host
RNA polymerase transcription from a hybrid trp-lac fusion promoter. Target gene
expression from the pET 11d vector relies on transcription from a T7 gn10-lac fusion
promoter mediated by a coexpressed viral RNA polymerase (T7 gn1). This viral
polymerase is supplied by host strains BL21(DE3) or HMS174(DE3) from a resident λ
prophage harboring a T7 gn1 gene under the transcriptional control of the lacUV5
promoter.

One strategy to maximize recombinant protein expression in E. coli is to
express the protein in a host bacteria with an impaired capacity to proteolytically cleave
Enzymology 185, Academic Press, San Diego, California (1990) 119-128). Another
strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an
expression vector so that the individual codons for each amino acid are those
preferentially utilized in E. coli (Wada et al., (1992) Nuc. Acids Res. 20:2111-2118).
Such alteration of nucleic acid sequences of the invention can be carried out by standard
DNA synthesis techniques.

In another embodiment, the Protease M expression vector is a yeast
expression vector. Examples of vectors for expression in yeast S. cerevisiae include
pYepSec1 (Baldari, et al., (1987) Embo J. 6:229-234), pMFa (Kurjan and Herskowitz,
(1982) Cell 30:933-943), pJRY88 (Schultz et al., (1987) Gene 54:113-123), and pYES2
(Invitrogen Corporation, San Diego, CA).

Alternatively, Protease M can be expressed in insect cells using
baculovirus expression vectors as described herein. Baculovirus vectors available for
expression of proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series
(Smith et al., (1983) Mol. Cell Biol. 3:2156-2165) and the pVL series (Lucklow, V.A.,
and Summers, M.D., (1989) Virology 170:31-39).

In yet another embodiment, a nucleic acid of the invention is expressed in
mammalian cells using a mammalian expression vector. Examples of mammalian
expression vectors include pCDM8 (Seed, B., (1987) Nature 329:840) and pMT2PC
(Kaufman et al. (1987), EMBO J. 6:187-195). When used in mammalian cells, the
expression vector's control functions are often provided by viral regulatory elements.
For example, commonly used promoters are derived from polyoma, Adenovirus 2,
cytomegalovirus and Simian Virus 40.

In another embodiment, the recombinant mammalian expression vector is
capable of directing expression of the nucleic acid preferentially in a particular cell type
(e.g., tissue-specific regulatory elements are used to express the nucleic acid).
Tissue-specific regulatory elements are known in the art. Non-limiting examples of suitable
tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al.
(1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988)
Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and
Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell
33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters
(e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA
86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916),
and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Patent No.
4,873,316 and European Application Publication No. 264,166).
Developmentally-regulated promoters are also encompassed, for example the murine hox promoters
(Kessel and Gruss (1990) Science 249:374-379) and the α-fetoprotein promoter (Camper
and Tilghman (1989) Genes Dev. 3:537-546).

The invention further provides a recombinant expression vector
comprising a DNA molecule of the invention cloned into the expression vector in an
antisense orientation. That is, the DNA molecule is operatively linked to a regulatory
sequence in a manner which allows for expression (by transcription of the DNA
molecule) of an RNA molecule which is antisense to Protease M mRNA. Regulatory
sequences operatively linked to a nucleic acid cloned in the antisense orientation can be
chosen which direct the continuous expression of the antisense RNA molecule in a
variety of cell types, for instance viral promoters and/or enhancers, or regulatory
sequences can be chosen which direct constitutive, tissue specific or cell type specific
expression of antisense RNA. The antisense expression vector can be in the form of a
recombinant plasmid, phagemid or attenuated virus in which antisense nucleic acids are
produced under the control of a high efficiency regulatory region, the activity of which
can be determined by the cell type into which the vector is introduced. For a discussion
of the regulation of gene expression using antisense genes see Weintraub, H. et al.,
Antisense RNA as a molecular tool for genetic analysis, Reviews - Trends in Genetics,
Vol. 1(1) 1986.

Another aspect of the invention pertains to recombinant host cells into
which a recombinant expression vector of the invention has been introduced. The terms
"host cell" and "recombinant host cell" are used interchangeably herein. It is understood
that such terms refer not only to the particular subject cell but to the progeny or potential
progeny of such a cell. Because certain modifications may occur in succeeding
generations due to either mutation or environmental influences, such progeny may not,
in fact, be identical to the parent cell, but are still included within the scope of the term
as used herein.

A host cell may be any prokaryotic or eukaryotic cell. For example,
Protease M protein may be expressed in bacterial cells such as E. coli, insect cells, yeast
or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other
suitable host cells are known to those skilled in the art.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via
conventional transformation or transfection techniques. As used herein, the terms
"transformation" and "transfection" are intended to refer to a variety of art-recognized
techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including
calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated
transfection, lipofection, or electroporation. Suitable methods for transforming or

J. r — J.:. — i_ — J. — M _ 1 — r- j . .• \ / t r i t y-u . ^. 
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Laboratory Manual^ 2nd Edition, Cold Spring Harbor Laboratory press (1 989)), and 
other laboratory manuals. 

For stable transfection of mammalian cells, it is known that, depending
upon the expression vector and transfection technique used, only a small fraction of cells
may integrate the foreign DNA into their genome. In order to identify and select these
integrants, a gene that encodes a selectable marker (e.g., resistance to antibiotics) is
generally introduced into the host cells along with the gene of interest. Preferred
selectable markers include those which confer resistance to drugs, such as G418,
hygromycin and methotrexate. Nucleic acid encoding a selectable marker may be
introduced into a host cell on the same vector as that encoding Protease M or may be
introduced on a separate vector. Cells stably transfected with the introduced nucleic
acid can be identified by drug selection (e.g., cells that have incorporated the selectable
marker gene will survive, while the other cells die).

A host cell of the invention, such as a prokaryotic or eukaryotic host cell
in culture, can be used to produce (i.e., express) Protease M protein. Accordingly, the
invention further provides methods for producing Protease M protein using the host cells
of the invention. In one embodiment, the method comprises culturing the host cell of the
invention (into which a recombinant expression vector encoding Protease M has been
introduced) in a suitable medium until Protease M is produced. In another embodiment,
the method further comprises isolating Protease M from the medium or the host cell.

IV. Protease M Proteins

Another aspect of the invention pertains to isolated Protease M proteins,
and biologically active portions thereof, as well as peptide fragments suitable as
immunogens to raise anti-Protease M antibodies. The invention provides an isolated
preparation of Protease M, or a biologically active portion thereof. An "isolated" protein
is substantially free of cellular material or culture medium when produced by
recombinant DNA techniques, or chemical precursors or other chemicals when
chemically synthesized. In a preferred embodiment, the Protease M protein has an
amino acid sequence shown in SEQ ID NO: 2. In other embodiments, the Protease M
protein is substantially homologous to SEQ ID NO: 2 and retains the functional activity
of the protein of SEQ ID NO: 2 yet differs in amino acid sequence due to natural allelic
variation or mutagenesis, as described in detail in subsection I above. Accordingly, in
another embodiment, the Protease M protein is a protein which comprises an amino acid
sequence at least 60% homologous to the amino acid sequence of SEQ ID NO: 2 and
possesses a Protease M bioactivity in vitro. Preferably, the protein is at least 70%
homologous to SEQ ID NO: 2, more preferably at least 80% homologous to SEQ ID
NO: 2, even more preferably at least 90% homologous to SEQ ID NO: 2. In a
particularly preferred embodiment, a Protease M polypeptide is at least about 95%
homologous to SEQ ID NO: 2.
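
By way of illustration only, the percentage thresholds above could be checked on a pair of pre-aligned sequences as sketched below. The sketch assumes that percent homology is taken as percent identity over aligned, non-gap positions, which is an assumption; the definition given in subsection I governs. The function names percent_identity and meets_threshold are hypothetical and are not part of the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned, non-gap columns in which both residues match.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and use `gap` to mark insertions/deletions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # columns containing a gap are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 60.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences, not SEQ ID NO: 2.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(meets_threshold("ACDEFGHIKL", "ACDEFGHIKV", threshold=95.0))
```

Whether gap columns should count against the percentage is a design choice; here they are skipped, so a different convention would give different values for gapped alignments.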

An isolated Protease M protein may comprise the entire amino acid 
sequence of SEQ ID NO: 2 (i.e., amino acids 1-244) or a biologically active portion 
thereof. For example, a biologically active portion of Protease M can comprise a mature 
form of Protease M in which a hydrophobic, amino-terminal signal sequence is absent, 
or which has been cleaved at a trypsin site. In one embodiment, such a mature form of 
Protease M comprises about amino acids 17-244 of SEQ ID NO: 2 and in another 
embodiment comprises about amino acids 22-244. The term "about" amino acids 17-244
or 22-244 is intended to indicate that there is some flexibility in the amino-terminal
residue, as discussed further in subsection I above. Moreover, other biologically active
portions, in which other regions of the protein are deleted, can be prepared by 
recombinant techniques and evaluated for serine protease activity as described in detail 
above. 

Protease M proteins are preferably produced by recombinant DNA 
techniques. For example, a nucleic acid molecule encoding the protein is cloned into an
expression vector (as described above), the expression vector is introduced into a host 
cell (as described above) and the Protease M protein is expressed in the host cell. The 
Protease M protein can then be isolated from the cells by an appropriate purification 
scheme using standard protein purification techniques. As an alternative to recombinant
expression, a Protease M protein or polypeptide can be synthesized chemically using
standard peptide synthesis techniques. Moreover, native Protease M protein can be 
isolated from cells (e.g., cultured human mammary epithelial cells), for example using 
an anti-Protease M antibody (discussed further below). 

In yet another embodiment of the present invention a Protease M protein
is encoded by a nucleic acid of SEQ ID NO: 1. In another embodiment a Protease M
protein is encoded by a nucleic acid at least about 60%, preferably about 70%, or more
preferably about 80% homologous to the nucleic acid of SEQ ID NO: 1. In a particularly
preferred embodiment a Protease M protein is encoded by a nucleic acid at least about
90% and preferably about 95% homologous to the nucleic acid of SEQ ID NO: 1. In still
another embodiment of the present invention a Protease M polypeptide is encoded by a
nucleic acid which hybridizes to the nucleic acid of SEQ ID NO: 1 under stringent
conditions.

The invention also provides Protease M fusion proteins. As used herein,
a Protease M "fusion protein" comprises a Protease M polypeptide operatively linked to
a non-Protease M polypeptide. A "Protease M polypeptide" refers to a polypeptide
having an amino acid sequence corresponding to Protease M, whereas a "non-Protease
M polypeptide" refers to a polypeptide having an amino acid sequence corresponding to
another protein. Within the fusion protein, the term "operatively linked" is intended to
indicate that the Protease M polypeptide and the non-Protease M polypeptide are fused
in-frame to each other. The non-Protease M polypeptide may be fused to the N-terminus
or C-terminus of the Protease M polypeptide. For example, in one embodiment the
fusion protein is a GST-Protease M fusion protein in which the Protease M sequences
are fused to the C-terminus of the GST sequences (see Example 3). Such fusion proteins
can facilitate the purification of recombinant Protease M. In another embodiment, the
fusion protein is a Protease M protein containing a heterologous signal sequence at its
N-terminus. For example, the native Protease M signal sequence (i.e., about amino
acids 1-16) can be removed and replaced with a signal sequence from another protein.
In certain host cells (e.g., mammalian host cells), expression and/or secretion of Protease
M may be increased through use of a heterologous signal sequence.

Preferably, a Protease M fusion protein of the invention is produced by
standard recombinant DNA techniques. For example, DNA fragments coding for the
different polypeptide sequences are ligated together in-frame in accordance with
conventional techniques, for example employing blunt-ended or stagger-ended termini
for ligation, restriction enzyme digestion to provide for appropriate termini, filling-in of
cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable
joining, and enzymatic ligation. In another embodiment, the fusion gene can be
synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor
primers which give rise to complementary overhangs between two consecutive gene
fragments which can subsequently be annealed and reamplified to generate a chimeric
gene sequence (see, for example, Current Protocols in Molecular Biology, eds. Ausubel
et al. John Wiley & Sons: 1992). Moreover, many expression vectors are commercially
available that already encode a fusion moiety (e.g., a GST polypeptide). A Protease
M-encoding nucleic acid can be cloned into such an expression vector such that the fusion
moiety is linked in-frame to the Protease M protein.

An isolated Protease M protein, or fragment thereof, can be used as an
immunogen to generate antibodies that bind Protease M using standard techniques for
polyclonal and monoclonal antibody preparation. In particularly preferred
embodiments, the Protease M immunogen comprises an epitope unique to Protease M.
The full-length Protease M protein can be used or, alternatively, the invention provides
antigenic peptide fragments of Protease M for use as immunogens. The antigenic
peptide of Protease M comprises at least 8 amino acid residues of the amino acid
sequence shown in SEQ ID NO: 2 and encompasses an epitope of Protease M such that
an antibody raised against the peptide forms a specific immune complex with Protease
M. Preferably, the antigenic peptide comprises at least 10 amino acid residues, more 
preferably at least 15 amino acid residues, even more preferably at least 20 amino acid 
residues, and most preferably at least 30 amino acid residues. Preferred epitopes 
encompassed by the antigenic peptide are regions of Protease M that are located on the 
surface of the protein, e.g., hydrophilic regions. Exemplary immunogens are described 
in more detail in the appended examples. 
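
As a rough illustration of how hydrophilic candidate regions of at least 8 residues might be shortlisted, the sketch below slides a fixed-length window along a sequence and ranks windows by mean hydropathy. It uses the Kyte-Doolittle hydropathy scale, which is a substituted technique not named in this specification, and it treats low hydropathy as a crude proxy for surface exposure; both are assumptions. The names window_scores and most_hydrophilic are hypothetical.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_scores(sequence: str, window: int = 8):
    """Return (1-based start, peptide, mean hydropathy) for every window."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = []
    for start in range(len(seq) - window + 1):
        peptide = seq[start:start + window]
        mean = sum(KYTE_DOOLITTLE[aa] for aa in peptide) / window
        scores.append((start + 1, peptide, mean))
    return scores


def most_hydrophilic(sequence: str, window: int = 8, top: int = 3):
    """Return the `top` windows with the lowest mean hydropathy."""
    return sorted(window_scores(sequence, window), key=lambda item: item[2])[:top]


if __name__ == "__main__":
    # Toy sequence, not SEQ ID NO: 2.
    toy = "LLLLLLLLKDEKRDEKLLLLLLLL"
    for start, peptide, mean in most_hydrophilic(toy, window=8, top=1):
        print(start, peptide, round(mean, 2))
```

A ranking of this kind only suggests candidates; whether a peptide is actually immunogenic or surface-exposed must be determined experimentally.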

A Protease M immunogen typically is used to prepare antibodies by
immunizing a suitable subject (e.g., rabbit, goat, mouse or other mammal) with the
immunogen. An appropriate immunogenic preparation can contain, for example,
recombinantly expressed Protease M protein or a chemically synthesized Protease M
peptide. The preparation can further include an adjuvant, such as Freund's complete or
incomplete adjuvant, or similar immunostimulatory agent. Immunization of a suitable
subject with an immunogenic Protease M preparation induces a polyclonal anti-Protease
M antibody response.

Modification of the structure of the subject Protease M polypeptides can
be for such purposes as enhancing therapeutic or prophylactic efficacy, stability (e.g., ex
vivo shelf life and resistance to proteolytic degradation in vivo), or post-translational
modifications. Such modified peptides, when designed to retain at least one activity of
the naturally-occurring form of the protein, or to produce specific antagonists thereof,
are considered functional equivalents of the Protease M polypeptides described in more
detail herein. Such modified peptides can be produced, for instance, by amino acid
substitution, deletion, or addition.

For example, it is reasonable to expect that an isolated replacement of a
leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a
serine, or a similar replacement of an amino acid with a structurally related amino acid
(i.e. isosteric and/or isoelectric mutations) will not have a major effect on the biological
activity of the resulting molecule. Conservative replacements are those that take place
within a family of amino acids that are related in their side chains. Genetically encoded
amino acids can be divided into four families: (1) acidic = aspartate, glutamate; (2)
basic = lysine, arginine, histidine; (3) nonpolar = alanine, valine, leucine, isoleucine,
proline, phenylalanine, methionine, tryptophan; and (4) uncharged polar = glycine,
asparagine, glutamine, cysteine, serine, threonine, tyrosine. Phenylalanine, tryptophan,
and tyrosine are sometimes classified jointly as aromatic amino acids. In similar
fashion, the amino acid repertoire can be grouped as (1) acidic = aspartate, glutamate;
(2) basic = lysine, arginine, histidine; (3) aliphatic = glycine, alanine, valine, leucine,
isoleucine, serine, threonine, with serine and threonine optionally grouped separately
as aliphatic-hydroxyl; (4) aromatic = phenylalanine, tyrosine, tryptophan; (5) amide =
asparagine, glutamine; and (6) sulfur-containing = cysteine and methionine (see, for
example, Biochemistry, 2nd ed., Ed. by L. Stryer, WH Freeman and Co.: 1981).
Whether a change in the amino acid sequence of a peptide results in a functional
Protease M homolog (e.g. functional in the sense that the resulting polypeptide mimics
or antagonizes the wild-type form) can be readily determined by assessing the ability of
the variant peptide to produce a response in cells in a fashion similar to the wild-type
protein, or competitively inhibit such a response. Polypeptides in which more than one
replacement has taken place can readily be tested in the same manner.
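
The four-family grouping described above can be expressed as a simple lookup, sketched here for illustration. The sketch labels a substitution as conservative when both residues fall in the same family; it says nothing about whether the variant retains activity, which is determined by assay as stated above. The names FAMILIES, is_conservative and classify_substitutions are hypothetical.

```python
# The four side-chain families listed in the text, in one-letter code.
FAMILIES = {
    "acidic": {"D", "E"},
    "basic": {"K", "R", "H"},
    "nonpolar": {"A", "V", "L", "I", "P", "F", "M", "W"},
    "uncharged polar": {"G", "N", "Q", "C", "S", "T", "Y"},
}
FAMILY_OF = {aa: name for name, members in FAMILIES.items() for aa in members}


def is_conservative(wild_type: str, variant: str) -> bool:
    """True if the two residues differ but belong to the same family."""
    wt, var = wild_type.upper(), variant.upper()
    if wt not in FAMILY_OF or var not in FAMILY_OF:
        raise ValueError("expected one-letter codes for the 20 standard residues")
    return wt != var and FAMILY_OF[wt] == FAMILY_OF[var]


def classify_substitutions(wild_type_seq: str, variant_seq: str):
    """List (1-based position, wild-type, variant, conservative?) per change.

    Assumption: the sequences have equal length (substitutions only).
    """
    if len(wild_type_seq) != len(variant_seq):
        raise ValueError("sequences must have equal length")
    changes = []
    pairs = zip(wild_type_seq.upper(), variant_seq.upper())
    for position, (wt, var) in enumerate(pairs, start=1):
        if wt != var:
            changes.append((position, wt, var, is_conservative(wt, var)))
    return changes


if __name__ == "__main__":
    # Toy sequences, not SEQ ID NO: 2.
    print(classify_substitutions("ALDT", "AIKT"))
```

The six-group scheme given in the same paragraph could be substituted by replacing the FAMILIES table; the two schemes classify some pairs differently (e.g., glycine and alanine).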

Amino acid residues of Protease M that are strongly conserved among
members of the chymotrypsin family of serine proteases (e.g., amino acid residues
involved in substrate catalysis) are predicted to be essential to the bioactivity of Protease
M and thus are not likely to be amenable to alteration. For example, the catalytic
residues of a serine protease are Ser195, His57 and Asp102 (chymotrypsin numbering
system). These three residues form a hydrogen bonding system often referred to as the
catalytic triad, or the charge relay system (Powers and Harper, supra). The catalytic
triad of serine proteases is conserved in Protease M (i.e., histidine 62, aspartate 106,
serine 197). The aspartate at position 191 predicts that this protein will produce trypsin-like
cleavage, and likewise, this residue may not be amenable to alteration.
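
To make the phrase "trypsin-like cleavage" concrete, the sketch below marks the positions at which a classical trypsin-type enzyme would be expected to cut a substrate, i.e., on the carboxyl side of lysine or arginine. The rule that cleavage is blocked when the next residue is proline is a commonly used convention and is an assumption here; the specification only predicts trypsin-like specificity for Protease M and does not state this rule. The function names are hypothetical.

```python
from typing import List


def trypsin_like_sites(sequence: str) -> List[int]:
    """Return 1-based positions of K/R residues after which cleavage is predicted.

    Assumption: cleavage occurs C-terminal to K or R unless the following
    residue is P. The last residue is never reported as a cleavage site.
    """
    seq = sequence.upper()
    sites = []
    for index, residue in enumerate(seq[:-1]):
        if residue in "KR" and seq[index + 1] != "P":
            sites.append(index + 1)
    return sites


def digest(sequence: str) -> List[str]:
    """Split a substrate sequence at every predicted cleavage site."""
    seq = sequence.upper()
    fragments = []
    start = 0
    for site in trypsin_like_sites(seq):
        fragments.append(seq[start:site])
        start = site
    fragments.append(seq[start:])
    return fragments


if __name__ == "__main__":
    # Toy substrate, not a known Protease M substrate.
    print(trypsin_like_sites("AKRPGKLM"))
    print(digest("AKRPGKLM"))
```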

Protease M contains twelve cysteine residues. Ten of these are conserved
in the two kallikreins, PSA and human trypsin and would be expected to form the
following disulfide bridges: (Cys28-Cys157), (Cys47-Cys63), (Cys138-Cys203),
(Cys168-Cys182) and (Cys193-Cys218). The other two cysteines (Cys131 and
Cys231) are not found in the kallikreins, PSA and human trypsin, but are found in
similar positions in bovine trypsin and would be expected to form a disulfide bond.

Twenty seven of the twenty nine 'invariant' amino acids surrounding the
active site of serine proteases (Dayhoff MO. (1978) Natl. Biomed. Res. Found.,
Washington DC, 5:Suppl. 3, pp. 79-81) are conserved in Protease M. One of the two
nonconserved amino acids, an isoleucine in Protease M in place of leucine, is a conservative
change. The other nonconserved amino acid, a histidine in Protease M instead of proline, is
also found in glandular kallikrein and PSA. The kallikreins and PSA have 11 amino
acid residues (109-119) which are not found in Protease M or trypsin. The function of
these amino acids is not clear, but they would be expected to form the so-called
kallikrein loop which would determine substrate specificity (Ashley PL, MacDonald RJ.
(1985) Biochemistry 24:4512-4520). Given their conservation among the serine
proteases, the invariant residues are likely to be important in the bioactivity of Protease M. Other
important amino acid residues are predicted to be those which are conserved in 4 of the
5 serine proteases shown in Figure 3.

This invention further contemplates a method for generating sets of
combinatorial mutants of the subject Protease M proteins as well as truncation mutants,
and is especially useful for identifying potential variant sequences (e.g. homologs) that
have a Protease M activity. The purpose of screening such combinatorial libraries is to
generate, for example, novel Protease M homologs which can act as either agonists or
antagonists, or alternatively, possess novel activities altogether. To illustrate, Protease
M homologs can be engineered by the present method to provide selective, constitutive
activation of enzymatic activity. Thus, combinatorially-derived homologs can be
generated to have an increased potency relative to a naturally occurring form of the
protein.

Likewise, Protease M homologs can be generated by the present
combinatorial approach to selectively inhibit (antagonize) a Protease M activity. For
instance, mutagenesis can provide Protease M homologs which are able to prevent serine
protease activity, e.g. the homologs can be dominant negative mutants. In a preferred
embodiment, a dominant negative mutant of a Protease M protein is mutated at one or
more residues of its catalytic site and/or specificity subsites.

In one aspect of this method, the amino acid sequences for a population
of Protease M homologs or other related proteins are aligned, preferably to promote the
highest homology possible. Such a population of variants can include, for example,
Protease M homologs from one or more species. Amino acids which appear at each
position of the aligned sequences are selected to create a degenerate set of combinatorial
sequences. In a preferred embodiment, the variegated library of Protease M variants is
generated by combinatorial mutagenesis at the nucleic acid level, and is encoded by a
variegated gene library. For instance, a mixture of synthetic oligonucleotides can be
enzymatically ligated into gene sequences such that the degenerate set of potential
Protease M sequences are expressible as individual polypeptides, or alternatively, as a
set of larger fusion proteins (e.g. for phage display) containing the set of Protease M
sequences therein.
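
The selection of amino acids appearing at each aligned position, and the degenerate set they define, can be sketched at the protein level as follows. This is an illustration under stated assumptions only: the input is a set of equal-length, pre-aligned sequences; gap characters are ignored; and the library is enumerated in memory, which is practical only for small toy examples. The names residues_per_position, library_size and enumerate_library are hypothetical.

```python
from itertools import product
from typing import List, Sequence


def residues_per_position(aligned_seqs: Sequence[str], gap: str = "-") -> List[List[str]]:
    """Collect the distinct residues seen in each alignment column.

    Columns consisting only of gaps are dropped.
    """
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    if len({len(seq) for seq in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    columns = []
    for column in zip(*(seq.upper() for seq in aligned_seqs)):
        residues = sorted(set(column) - {gap})
        if residues:
            columns.append(residues)
    return columns


def library_size(columns: List[List[str]]) -> int:
    """Number of distinct sequences in the degenerate set."""
    size = 1
    for residues in columns:
        size *= len(residues)
    return size


def enumerate_library(columns: List[List[str]], limit: int = 10000) -> List[str]:
    """Enumerate every combination, refusing libraries larger than `limit`."""
    if library_size(columns) > limit:
        raise ValueError("library too large to enumerate")
    return ["".join(combo) for combo in product(*columns)]


if __name__ == "__main__":
    # Toy alignment, not Protease M homologs.
    cols = residues_per_position(["ACD", "AGD", "ACE"])
    print(library_size(cols))
    print(enumerate_library(cols))
```

Because the size is the product of the per-position counts, it grows very quickly with the number of variable positions, which is one reason such sets are encoded as degenerate oligonucleotide mixtures rather than listed individually.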

There are many ways by which such libraries of potential Protease M
homologs can be generated from a degenerate oligonucleotide sequence. Chemical
synthesis of a degenerate gene sequence can be carried out in an automatic DNA
synthesizer, and the synthetic genes then ligated into an appropriate expression vector.
The purpose of a degenerate set of genes is to provide, in one mixture, all of the
sequences encoding the desired set of potential Protease M sequences. The synthesis of
degenerate oligonucleotides is well known in the art (see for example, Narang, SA
(1983) Tetrahedron 39:3; Itakura et al. (1981) Recombinant DNA, Proc 3rd Cleveland
Sympos. Macromolecules, ed. AG Walton, Amsterdam: Elsevier pp273-289; Itakura et
al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al. (1984) Science 198:1056; Ike et
al. (1983) Nucleic Acid Res. 11:477). Such techniques have been employed in the
directed evolution of other proteins (see, for example, Scott et al. (1990) Science
249:386-390; Roberts et al. (1992) PNAS 89:2429-2433; Devlin et al. (1990) Science
249: 404-406; Cwirla et al. (1990) PNAS 87: 6378-6382; as well as U.S. Patents Nos.
5,223,409, 5,198,346, and 5,096,815).

Likewise, a library of coding sequence fragments can be provided for a
Protease M clone in order to generate a variegated population of Protease M fragments
for screening and subsequent selection of bioactive fragments. A variety of techniques
are known in the art for generating such libraries, including chemical synthesis. In one
embodiment, a library of coding sequence fragments can be generated by (i) treating a
double stranded PCR fragment of a Protease M coding sequence with a nuclease under
conditions wherein nicking occurs only about once per molecule; (ii) denaturing the
double stranded DNA; (iii) renaturing the DNA to form double stranded DNA which can
include sense/antisense pairs from different nicked products; (iv) removing single
stranded portions from reformed duplexes by treatment with S1 nuclease; and (v)
ligating the resulting fragment library into an expression vector. By this exemplary
method, an expression library can be derived which codes for N-terminal, C-terminal
and internal fragments of various sizes.

A wide range of techniques are known in the art for screening gene
products of combinatorial libraries made by point mutations or truncation, and for
screening cDNA libraries for gene products having a certain property. Such techniques
will be generally adaptable for rapid screening of the gene libraries generated by the
combinatorial mutagenesis of Protease M homologs. The most widely used techniques
for screening large gene libraries typically comprise cloning the gene library into
replicable expression vectors, transforming appropriate cells with the resulting library of
vectors, and expressing the combinatorial genes under conditions in which detection of a
desired activity facilitates relatively easy isolation of the vector encoding the gene
whose product was detected.

In an exemplary embodiment, the library of Protease M variants is
expressed as a fusion protein on the surface of a viral particle. For instance, in the
filamentous phage system, foreign peptide sequences can be expressed on the surface of
infectious phage, thereby conferring two significant benefits. First, since these phage
can be applied to affinity matrices at very high concentrations, a large number of phage
can be screened at one time. Second, since each infectious phage displays the
combinatorial gene product on its surface, if a particular phage is recovered from an
affinity matrix in low yield, the phage can be amplified by another round of infection.
The group of almost identical E. coli filamentous phages M13, fd, and f1 are most often
used in phage display libraries, as either of the phage gIII or gVIII coat proteins can be
used to generate fusion proteins without disrupting the ultimate packaging of the viral
particle (Ladner et al. PCT publication WO 90/02909; Garrard et al., PCT publication
WO 92/09690; Marks et al. (1992) J. Biol. Chem. 267:16007-16010; Griffiths et al.
(1993) EMBO J 12:725-734; Clackson et al. (1991) Nature 352:624-628; and Barbas et
al. (1992) PNAS 89:4457-4461).

For example, the recombinant phage antibody system (RPAS, Pharmacia
Catalog number 27-9400-01) can be easily modified for use in expressing and screening
Protease M combinatorial libraries by panning on glutathione immobilized
substrate/GST fusion proteins to enrich for Protease M homologs which retain an ability
to bind a substrate or regulatory protein. Each of these Protease M homologs can
subsequently be screened for further biological activities in order to differentiate
agonists and antagonists. For example, homologs isolated from the combinatorial
library can be tested for their enzymatic activity directly, or for their effect on cellular
proliferation relative to the wild-type form of the protein.

The invention also provides for reduction of the Protease M proteins to
generate mimetics, e.g. peptide or non-peptide agents, which are able to disrupt a
biological activity of a Protease M polypeptide of the present invention, e.g. as a catalytic
inhibitor or an inhibitor of protein-protein interactions. Thus, such mutagenic
techniques as described above are also useful to map the determinants of the Protease M
proteins which participate in protein-protein interactions. To illustrate, the critical
residues of a subject Protease M polypeptide which are involved in proteolytic cleavage
can be used to generate Protease M-derived peptidomimetics which competitively
inhibit binding of the authentic Protease M protein with that moiety. By employing, for
example, scanning mutagenesis to map the amino acid residues of a protein which are
involved in binding other proteins, peptidomimetic compounds can be generated which
mimic those residues which facilitate the interaction. Such mimetics may then be used
to interfere with the normal function of a Protease M protein. For instance,
non-hydrolyzable peptide analogs of such residues can be generated using benzodiazepine
(e.g., see Freidinger et al. in Peptides: Chemistry and Biology, G.R. Marshall ed.,
ESCOM Publisher: Leiden, Netherlands, 1988), azepine (e.g., see Huffman et al. in
Peptides: Chemistry and Biology, G.R. Marshall ed., ESCOM Publisher: Leiden,
Netherlands, 1988), substituted gamma lactam rings (Garvey et al. in Peptides:
Chemistry and Biology, G.R. Marshall ed., ESCOM Publisher: Leiden, Netherlands,
1988), keto-methylene pseudopeptides (Ewenson et al. (1986) J Med Chem 29:295; and
Ewenson et al. in Peptides: Structure and Function (Proceedings of the 9th American
Peptide Symposium) Pierce Chemical Co. Rockland, IL, 1985), β-turn dipeptide cores
(Nagai et al. (1985) Tetrahedron Lett 26:647; and Sato et al. (1986) J Chem Soc Perkin
Trans 1:1231), and β-aminoalcohols (Gordon et al. (1985) Biochem Biophys Res
Commun 126:419; and Dann et al. (1986) Biochem Biophys Res Commun 134:71).

V. Antibodies 

Another aspect of the invention pertains to anti-Protease M antibodies.
The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin molecules, i.e., molecules that
contain an antigen binding site which specifically binds (immunoreacts with) an antigen,
such as Protease M. The invention provides polyclonal and monoclonal antibodies that
bind Protease M. The term "monoclonal antibody" or "monoclonal antibody
composition", as used herein, refers to a population of antibody molecules that contain
only one species of an antigen binding site capable of immunoreacting with a particular
epitope of Protease M. A monoclonal antibody composition thus typically displays a
single binding affinity for a particular Protease M protein with which it immunoreacts.

Polyclonal Protease M antibodies can be prepared as described above by
immunizing a suitable subject with a Protease M immunogen, as described in more
detail in the appended Examples. The anti-Protease M antibody titer in the immunized
subject can be monitored over time by standard techniques, such as with an enzyme
linked immunosorbent assay (ELISA) using immobilized Protease M. If desired, the
antibody molecules directed against Protease M can be isolated from the mammal (e.g.,
from the blood) and further purified by well known techniques, such as protein A
chromatography to obtain the IgG fraction. At an appropriate time after immunization,
e.g., when the anti-Protease M antibody titers are highest, antibody-producing cells can
be obtained from the subject and used to prepare monoclonal antibodies by standard
techniques, such as the hybridoma technique originally described by Kohler and
Milstein (1975, Nature 256:495-497) (see also, Brown et al. (1981) J. Immunol 127:539-46;
Brown et al. (1980) J Biol Chem 255:4980-83; Yeh et al. (1976) PNAS 76:2927-31;
and Yeh et al. (1982) Int J. Cancer 29:269-75), the more recent human B cell
hybridoma technique (Kozbor et al. (1983) Immunol Today 4:72), the EBV-hybridoma
technique (Cole et al. (1985), Monoclonal Antibodies and Cancer Therapy, Alan R.
Liss, Inc., pp. 77-96) or trioma techniques. The technology for producing monoclonal
antibody hybridomas is well known (see generally R. H. Kenneth, in Monoclonal
Antibodies: A New Dimension In Biological Analyses, Plenum Publishing Corp., New
York, New York (1980); E. A. Lerner (1981) Yale J. Biol. Med. 54:387-402; M. L.
Gefter et al. (1977) Somatic Cell Genet., 3:231-36). Briefly, an immortal cell line
(typically a myeloma) is fused to lymphocytes (typically splenocytes) from a mammal
immunized with a Protease M immunogen as described above, and the culture
supernatants of the resulting hybridoma cells are screened to identify a hybridoma
producing a monoclonal antibody that binds Protease M.

Any of the many well known protocols used for fusing lymphocytes and
immortalized cell lines can be applied for the purpose of generating an anti-Protease M
monoclonal antibody (see, e.g., G. Galfre et al. (1977) Nature 266:550-52; Gefter et al.
Somatic Cell Genet., cited supra; Lerner, Yale J. Biol. Med., cited supra; Kenneth,
Monoclonal Antibodies, cited supra). Moreover, the person of ordinary skill in the art
will appreciate that there are many variations of such methods which also would be
useful. Typically, the immortal cell line (e.g., a myeloma cell line) is derived from the
same mammalian species as the lymphocytes. For example, murine hybridomas can be
made by fusing lymphocytes from a mouse immunized with an immunogenic
preparation of the present invention with an immortalized mouse cell line. Preferred
immortal cell lines are mouse myeloma cell lines that are sensitive to culture medium
containing hypoxanthine, aminopterin and thymidine ("HAT medium"). Any of a
number of myeloma cell lines may be used as a fusion partner according to standard
techniques, e.g., the P3-NS1/1-Ag4-1, P3-x63-Ag8.653 or Sp2/0-Ag14 myeloma lines.
These myeloma lines are available from the American Type Culture Collection (ATCC),
Rockville, Md. Typically, HAT-sensitive mouse myeloma cells are fused to mouse
splenocytes using polyethylene glycol ("PEG"). Hybridoma cells resulting from the
fusion are then selected using HAT medium, which kills unfused and unproductively
fused myeloma cells (unfused splenocytes die after several days because they are not
transformed).

Hybridoma cells producing a monoclonal antibody of the invention are detected by
screening the hybridoma culture supernatants for antibodies that bind Protease M, e.g.,
using a standard ELISA assay.

As an alternative to preparing monoclonal antibody-secreting hybridomas, a
monoclonal anti-Protease M antibody can be identified and isolated by screening a
recombinant combinatorial immunoglobulin library (e.g., an antibody phage display
library) with Protease M to thereby isolate immunoglobulin library members that bind
Protease M. Kits for generating and screening phage display libraries are commercially
available (e.g., the Pharmacia Recombinant Phage Antibody System, Catalog No.
27-9400-01; and the Stratagene SurfZAP™ Phage Display Kit, Catalog No. 240612).
Additionally, examples of methods and reagents particularly amenable for use in
generating and screening antibody display libraries can be found in, for example, Ladner
et al. U.S. Patent No. 5,223,409; Kang et al. International Publication No. WO
92/18619; Dower et al. International Publication No. WO 91/17271; Winter et al.
International Publication WO 92/20791; Markland et al. International Publication No.
WO 92/15679; Breitling et al. International Publication WO 93/01288; McCafferty et al.
International Publication No. WO 92/01047; Garrard et al. International Publication No.
WO 92/09690; Ladner et al. International Publication No. WO 90/02809; Fuchs et al.
(1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85;
Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734;
Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature
352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991)
Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137;
Barbas et al. (1991) PNAS 88:7978-7982; and McCafferty et al. Nature (1990) 348:552-554.

Additionally, recombinant anti-Protease M antibodies, such as chimeric
and humanized monoclonal antibodies, comprising both human and non-human
portions, which can be made using standard recombinant DNA techniques, are within
the scope of the invention. Such chimeric and humanized monoclonal antibodies can be
produced by recombinant DNA techniques known in the art, for example using methods
described in Robinson et al. International Patent Publication PCT/US86/02269; Akira, et
al. European Patent Application 184,187; Taniguchi, M., European Patent Application
171,496; Morrison et al. European Patent Application 173,494; Neuberger et al. PCT
Application WO 86/01533; Cabilly et al. U.S. Patent No. 4,816,567; Cabilly et al.
European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu
et al. (1987) PNAS 84:3439-3443; Liu et al. (1987) J. Immunol 139:3521-3526; Sun et
al. (1987) PNAS 84:214-218; Nishimura et al. (1987) Canc. Res. 47:999-1005; Wood et
al. (1985) Nature 314:446-449; Shaw et al. (1988) J. Natl Cancer Inst. 80:1553-1559;
Morrison, S. L. (1985) Science 229:1202-1207; Oi et al. (1986) BioTechniques
4:214; Winter U.S. Patent 5,225,539; Jones et al. (1986) Nature 321:552-525;
Verhoeyan et al. (1988) Science 239:1534; and Beidler et al. (1988) J. Immunol
141:4053-4060.

An anti-Protease M antibody (e.g., monoclonal antibody) can be used to 
isolate Protease M by standard techniques, such as affinity chromatography or 
immunoprecipitation. An anti-Protease M antibody can facilitate the purification of 
natural Protease M from cells and of recombinantly produced Protease M expressed in
host cells. Moreover, an anti-Protease M antibody can be used to detect Protease M
protein (e.g., in a cellular lysate or cell supernatant). Detection may be facilitated by
coupling (i.e., physically linking) the antibody to a detectable substance. Examples of
detectable substances include various enzymes, prosthetic groups, fluorescent materials,
luminescent materials and radioactive materials. Examples of suitable enzymes include
horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase;
examples of suitable prosthetic group complexes include streptavidin/biotin and
avidin/biotin; examples of suitable fluorescent materials include umbelliferone,
fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein,
dansyl chloride or phycoerythrin; an example of a luminescent material is
luminol; and examples of suitable radioactive materials include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

VI. Transgenic animals

Another aspect of the invention features transgenic non-human animals
which express a heterologous Protease M gene of the present invention, or which have
had one or more genomic Protease M genes disrupted in at least one of the tissue or
cell types of the animal. Accordingly, the invention features an animal model for
proliferative disorders, which animal has one or more Protease M alleles which are
mis-expressed. For example, a mouse can be bred which has one or more Protease M alleles
deleted or otherwise rendered inactive. Such a mouse model can then be used to study
disorders arising from mis-expressed Protease M genes, as well as for evaluating
potential therapies for similar disorders.

Another aspect of the present invention concerns transgenic animals
which are comprised of cells (of that animal) which contain a transgene of the present
invention and which preferably (though optionally) express an exogenous Protease M
protein in one or more cells in the animal. A Protease M transgene can encode the
wild-type form of the protein, or can encode homologs thereof, including both agonists and
antagonists, as well as antisense constructs. In preferred embodiments, the expression of
the transgene is restricted to specific subsets of cells, tissues or developmental stages
utilizing, for example, cis-acting sequences that control expression in the desired pattern.
In the present invention, such mosaic expression of a Protease M protein can be essential
for many forms of lineage analysis and can additionally provide a means to assess the
effects of, for example, lack of Protease M expression which might grossly alter
development in small patches of tissue within an otherwise normal embryo. Toward this
end, tissue-specific regulatory sequences and conditional regulatory sequences can be
used to control expression of the transgene in certain spatial patterns. Moreover,
temporal patterns of expression can be provided by, for example, conditional
recombination systems or prokaryotic transcriptional regulatory sequences.

Genetic techniques which allow the expression of transgenes to be
regulated via site-specific genetic manipulation in vivo are known to those skilled in the
art. For instance, genetic systems are available which allow for the regulated expression
of a recombinase that catalyzes the genetic recombination of a target sequence. As used
herein, the phrase "target sequence" refers to a nucleotide sequence that is genetically
recombined by a recombinase. The target sequence is flanked by recombinase
recognition sequences and is generally either excised or inverted in cells expressing
recombinase activity. Recombinase catalyzed recombination events can be designed
such that recombination of the target sequence results in either the activation or
repression of expression of one of the subject Protease M proteins. For example,
excision of a target sequence which interferes with the expression of a recombinant
Protease M gene, such as one which encodes an antagonistic homolog or an antisense
transcript, can be designed to activate expression of that gene. This interference with
expression of the protein can result from a variety of mechanisms, such as spatial
separation of the Protease M gene from the promoter element or an internal stop codon.
Moreover, the transgene can be made wherein the coding sequence of the gene is
flanked by recombinase recognition sequences and is initially transfected into cells in a
3' to 5' orientation with respect to the promoter element. In such an instance, inversion
of the target sequence will reorient the subject gene by placing the 5' end of the coding
sequence in an orientation with respect to the promoter element which allows for
promoter driven transcriptional activation.

In an illustrative embodiment, either the cre/loxP recombinase system of
bacteriophage P1 (Lakso et al. (1992) PNAS 89:6232-6236; Orban et al. (1992) PNAS
89:6861-6865) or the FLP recombinase system of Saccharomyces cerevisiae (O'Gorman
et al. (1991) Science 251:1351-1355; PCT publication WO 92/15694) can be used to
generate in vivo site-specific genetic recombination systems. Cre recombinase catalyzes
the site-specific recombination of an intervening target sequence located between loxP
sequences. loxP sequences are 34 base pair nucleotide repeat sequences to which the
Cre recombinase binds and are required for Cre recombinase mediated genetic
recombination. The orientation of loxP sequences determines whether the intervening
target sequence is excised or inverted when Cre recombinase is present (Abremski et al.
(1984) J. Biol. Chem. 259:1509-1514): Cre catalyzes excision of the target sequence
when the loxP sequences are oriented as direct repeats and inversion of the
target sequence when the loxP sequences are oriented as inverted repeats.
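
The orientation rule described above can be illustrated with a small Python sketch. It is only a toy string model and does not describe any construct of this specification: the function names are invented for illustration, the promoter and target strings in any usage are hypothetical, and the loxP sequence shown is the commonly cited 34 base pair site, included here as an assumption.

```python
# Toy model of Cre-mediated recombination between two loxP sites.
# Assumption: the commonly cited 34 bp loxP sequence (13 bp arm, 8 bp
# asymmetric spacer, 13 bp arm). All names below are illustrative only.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


LOXP_RC = reverse_complement(LOXP)  # the same site in the opposite orientation


def find_sites(construct):
    """Return (position, orientation) for every loxP site in the construct."""
    sites = []
    for i in range(len(construct) - len(LOXP) + 1):
        window = construct[i:i + len(LOXP)]
        if window == LOXP:
            sites.append((i, "+"))
        elif window == LOXP_RC:
            sites.append((i, "-"))
    return sites


def cre_recombine(construct):
    """Apply the orientation rule to a construct with exactly two loxP sites.

    Direct repeats: the intervening target is excised, leaving one site.
    Inverted repeats: the intervening target is inverted in place.
    """
    sites = find_sites(construct)
    if len(sites) != 2:
        raise ValueError("this sketch expects exactly two loxP sites")
    (first, orient1), (second, orient2) = sites
    target_start = first + len(LOXP)
    target = construct[target_start:second]
    if orient1 == orient2:
        return "excision", construct[:target_start] + construct[second + len(LOXP):]
    inverted = reverse_complement(target)
    return "inversion", construct[:target_start] + inverted + construct[second:]
```

Calling `cre_recombine` on a construct whose two sites share an orientation returns the construct with the target removed, while a construct with opposed sites returns the target as its reverse complement, which mirrors the reorientation of a coding sequence relative to a promoter.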

Accordingly, genetic recombination of the target sequence is dependent
on expression of the Cre recombinase. Expression of the recombinase can be regulated
by promoter elements which are subject to regulatory control, e.g., tissue-specific,
developmental stage-specific, inducible or repressible by externally added agents. This
regulated control will result in genetic recombination of the target sequence only in cells
where recombinase expression is mediated by the promoter element. Thus, the
activation of expression of a recombinant Protease M protein can be regulated via control
of recombinase expression.

Use of the cre/loxP recombinase system to regulate expression of a 
recombinant Protease M protein requires the construction of a transgenic animal 
containing transgenes encoding both the Cre recombinase and the subject protein.
Animals containing both the Cre recombinase and a recombinant Protease M gene can 
be provided through the construction of "double" transgenic animals. A convenient 
method for providing such animals is to mate two transgenic animals each containing a 
transgene, e.g., a Protease M gene and recombinase gene. 
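
As a rough worked example of the mating strategy described above, the following Python sketch computes the expected fraction of "double" transgenic offspring. It rests on assumptions that are not stated in the text: each parent is taken to be hemizygous for a single-copy transgene, the two transgenes are taken to be unlinked, and segregation is taken to be Mendelian. The names `ProteaseM` and `cre` are labels chosen for illustration.

```python
from fractions import Fraction
from itertools import product


def gamete_probability(copies):
    """Chance that a gamete carries the transgene for 0, 1 or 2 copies at a locus."""
    return {0: Fraction(0), 1: Fraction(1, 2), 2: Fraction(1)}[copies]


def offspring_distribution(parent_a, parent_b):
    """Map each set of inherited transgenes to its expected frequency.

    Each parent is a dict of transgene name -> copy number (missing means 0).
    Loci are assumed to segregate independently.
    """
    names = sorted(set(parent_a) | set(parent_b))
    distribution = {}
    for carried in product([False, True], repeat=len(names)):
        probability = Fraction(1)
        for name, has_it in zip(names, carried):
            from_a = gamete_probability(parent_a.get(name, 0))
            from_b = gamete_probability(parent_b.get(name, 0))
            inherits = 1 - (1 - from_a) * (1 - from_b)
            probability *= inherits if has_it else 1 - inherits
        if probability:
            key = frozenset(n for n, h in zip(names, carried) if h)
            distribution[key] = probability
    return distribution


# Hemizygous Protease M transgenic crossed with a hemizygous recombinase transgenic.
cross = offspring_distribution({"ProteaseM": 1}, {"cre": 1})
double_transgenic = cross[frozenset({"ProteaseM", "cre"})]
```

Under these assumptions one offspring in four is expected to carry both transgenes, which gives a sense of how many progeny would need to be genotyped.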

One advantage of initially constructing transgenic animals
containing a Protease M transgene in a recombinase-mediated expressible format derives
from the likelihood that the subject protein, whether agonistic or antagonistic, can be
deleterious upon expression in the transgenic animal. In such an instance, a founder
population, in which the subject transgene is silent in all tissues, can be propagated and
maintained. Individuals of this founder population can be crossed with animals
expressing the recombinase in, for example, one or more tissues and/or a desired
temporal pattern. Thus, the creation of a founder population in which, for example, an 
antagonistic Protease M transgene is silent will allow the study of progeny from that 
founder in which disruption of Protease M mediated induction in a particular tissue or at 
certain developmental stages would result in, for example, a lethal phenotype. 

Similar conditional transgenes can be provided using prokaryotic 
promoter sequences which require prokaryotic proteins to be simultaneously expressed in
order to facilitate expression of the Protease M transgene. Exemplary promoters and the 
corresponding trans-activating prokaryotic proteins are given in U.S. Patent No. 
4,833,080. 

Moreover, expression of the conditional transgenes can be induced by 
gene therapy-like methods wherein a gene encoding the trans-activating protein, e.g. a 
recombinase or a prokaryotic protein, is delivered to the tissue and caused to be 
expressed, such as in a cell-type specific manner. By this method, a Protease M 
transgene could remain silent into adulthood until "turned on" by the introduction of the 
trans-activator. 

In an exemplary embodiment, the "transgenic non-human animals" of the 
invention are produced by introducing transgenes into the germline of the non-human
animal. Embryonic target cells at various developmental stages can be used to introduce
transgenes. Different methods are used depending on the stage of development of the
embryonic target cell. The zygote is the best target for micro-injection. In the mouse, the
male pronucleus reaches the size of approximately 20 micrometers in diameter, which
allows reproducible injection of 1-2 pl of DNA solution. The use of zygotes as a target
for gene transfer has a major advantage in that in most cases the injected DNA will be
incorporated into the host genome before the first cleavage (Brinster et al. (1985) PNAS
82:4438-4442). As a consequence, all cells of the transgenic non-human animal will
carry the incorporated transgene. This will in general also be reflected in the efficient
transmission of the transgene to offspring of the founder since 50% of the germ cells
will harbor the transgene. Microinjection of zygotes is the preferred method for
incorporating transgenes in practicing the invention.

Retroviral infection can also be used to introduce Protease M transgenes
into a non-human animal. The developing non-human embryo can be cultured in vitro to
the blastocyst stage. During this time, the blastomeres can be targets for retroviral
infection (Jaenisch, R. (1976) PNAS 73:1260-1264). Efficient infection of the
blastomeres is obtained by enzymatic treatment to remove the zona pellucida
(Manipulating the Mouse Embryo, Hogan, ed. (Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, 1986)). The viral vector system used to introduce the transgene is
typically a replication-defective retrovirus carrying the transgene (Jahner et al. (1985)
PNAS 82:6927-6931; Van der Putten et al. (1985) PNAS 82:6148-6152). Transfection is
easily and efficiently obtained by culturing the blastomeres on a monolayer of
virus-producing cells (Van der Putten, supra; Stewart et al. (1987) EMBO J. 6:383-388).
Alternatively, infection can be performed at a later stage. Virus or virus-producing cells
can be injected into the blastocoele (Jahner et al. (1982) Nature 298:623-628). Most of
the founders will be mosaic for the transgene since incorporation occurs only in a subset
of the cells which formed the transgenic non-human animal. Further, the founder may
contain various retroviral insertions of the transgene at different positions in the genome
which generally will segregate in the offspring. In addition, it is also possible to
introduce transgenes into the germ line by intrauterine retroviral infection of the
midgestation embryo (Jahner et al. (1982) supra).

A third type of target cell for transgene introduction is the embryonic
stem cell (ES). ES cells are obtained from pre-implantation embryos cultured in vitro
and fused with embryos (Evans et al. (1981) Nature 292:154-156; Bradley et al. (1984)
Nature 309:255-258; Gossler et al. (1986) PNAS 83:9065-9069; and Robertson et al.
(1986) Nature 322:445-448). Transgenes can be efficiently introduced into the ES cells
by DNA transfection or by retrovirus-mediated transduction. Such transformed ES cells
can thereafter be combined with blastocysts from a non-human animal. The ES cells
thereafter colonize the embryo and contribute to the germ line of the resulting chimeric
animal. For review see Jaenisch, R. (1988) Science 240:1468-1474.

Methods of making Protease M knock-out or disruption transgenic
animals are also generally known. See, for example, Manipulating the Mouse Embryo
(Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Recombinase
dependent knockouts can also be generated, e.g. by homologous recombination to insert
recombinase target sequences flanking portions of an endogenous Protease M gene, such
that tissue specific and/or temporal inactivation of a Protease M allele can be
controlled as above.

VII. Pharmaceutical Compositions 

The Protease M proteins, Protease M nucleic acids, and anti-Protease M
antibodies of the invention can be incorporated into pharmaceutical compositions
suitable for administration. Such compositions typically comprise the protein or
antibody and a pharmaceutically acceptable carrier. As used herein the term
"pharmaceutically acceptable carrier" is intended to include any and all solvents,
dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption
delaying agents, and the like, compatible with pharmaceutical administration. The use
of such media and agents for pharmaceutically active substances is well known in the
art. Except insofar as any conventional media or agent is incompatible with the active
compound, use thereof in the compositions is contemplated. Supplementary active
compounds can also be incorporated into the compositions.

A pharmaceutical composition of the invention is formulated to be 
compatible with its intended route of administration. For example, solutions or 
suspensions used for parenteral, intradermal, or subcutaneous application can include the 
following components: a sterile diluent such as water for injection, saline solution, fixed 
oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; 
antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as 
ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic 
acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of 
tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, 
such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be 
enclosed in ampoules, disposable syringes or multiple dose vials made of glass or 
plastic. 

Pharmaceutical compositions suitable for injectable use include sterile 
aqueous solutions (where water soluble) or dispersions and sterile powders for the
extemporaneous preparation of sterile injectable solutions or dispersions. For
intravenous administration, suitable carriers include physiological saline, bacteriostatic
water, Cremophor EL™ (BASF, Parsippany, NJ) or phosphate buffered saline (PBS).
In all cases, the composition must be sterile and should be fluid to the extent that easy
syringability exists. It must be stable under the conditions of manufacture and storage
and must be preserved against the contaminating action of microorganisms such as
bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for
example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid
polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity
can be maintained, for example, by the use of a coating such as lecithin, by the
maintenance of the required particle size in the case of dispersion and by the use of
surfactants. Prevention of the action of microorganisms can be achieved by various
antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol,
ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include
isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium
chloride in the composition. Prolonged absorption of the injectable compositions can be
brought about by including in the composition an agent which delays absorption, for
example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active 
compound (e.g., a Protease M protein or anti-Protease M antibody) in the required
amount in an appropriate solvent with one or a combination of ingredients enumerated 
above, as required, followed by filtered sterilization. Generally, dispersions are prepared 
by incorporating the active compound into a sterile vehicle which contains a basic 
dispersion medium and the required other ingredients from those enumerated above. In 
the case of sterile powders for the preparation of sterile injectable solutions, the 
preferred methods of preparation are vacuum drying and freeze-drying which yields a 
powder of the active ingredient plus any additional desired ingredient from a previously 
sterile-filtered solution thereof. 

Oral compositions generally include an inert diluent or an edible carrier. 
They can be enclosed in gelatin capsules or compressed into tablets. For the purpose of 
oral therapeutic administration, the active compound can be incorporated with excipients 
and used in the form of tablets, troches, or capsules. Oral compositions can also be 
prepared using a fluid carrier for use as a mouthwash, wherein the compound in the fluid
carrier is applied orally and swished and expectorated or swallowed. Pharmaceutically
compatible binding agents, and/or adjuvant materials can be included as part of the 
composition. The tablets, pills, capsules, troches and the like can contain any of the 
following ingredients, or compounds of a similar nature: a binder such as 
microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or
lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant
such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a
sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint,
methyl salicylate, or orange flavoring.

In one embodiment, the active compounds are prepared with carriers that
will protect the compound against rapid elimination from the body, such as a controlled
release formulation, including implants and microencapsulated delivery systems.
Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate,
polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid.
Methods for preparation of such formulations will be apparent to those skilled in the art.
The materials can also be obtained commercially from Alza Corporation and Nova
Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected
cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically
acceptable carriers. These may be prepared according to methods known to those skilled
in the art, for example, as described in U.S. Patent No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions
in dosage unit form for ease of administration and uniformity of dosage. Dosage unit
form as used herein refers to physically discrete units suited as unitary dosages for the
subject to be treated; each unit containing a predetermined quantity of active compound
calculated to produce the desired therapeutic effect in association with the required
pharmaceutical carrier. The specification for the dosage unit forms of the invention is
dictated by and directly dependent on (a) the unique characteristics of the active
compound and the particular therapeutic effect to be achieved, and (b) the limitations
inherent in the art of compounding such an active compound for the treatment of
individuals.

VIII. Uses and Methods of the Invention

As described in more detail in the appended Examples, the Protease M
protein of the invention exhibits serine protease activity. Accordingly, Protease M is
useful as a serine protease, either in vitro or in vivo. The isolated nucleic acid molecules
of the invention can be used to express Protease M protein (e.g., via a recombinant
expression vector in a host cell), to detect Protease M mRNA (e.g., in a biological
sample) and to modulate Protease M activity, as discussed further below. Moreover,
the anti-Protease M antibodies of the invention can be used to detect and isolate Protease
M protein and modulate Protease M activity, also discussed further below.

A. Diagnostic and Prognostic Assays

The present invention provides a method for determining if a subject is at
risk for a disorder characterized by aberrant cell proliferation and/or differentiation. In
preferred embodiments, the methods can be characterized as comprising detecting, in a
sample of cells from the subject, the presence or absence of a genetic lesion
characterized by at least one of (i) an alteration affecting the integrity of a gene encoding
a Protease M protein, or (ii) the mis-expression of the Protease M gene. To illustrate,
such genetic lesions can be detected by ascertaining the existence of at least one of (i) a
deletion of one or more nucleotides from a Protease M gene, (ii) an addition of one or
more nucleotides to a Protease M gene, (iii) a substitution of one or more nucleotides of
a Protease M gene, (iv) a gross chromosomal rearrangement of a Protease M gene, (v) a
gross alteration in the level of a messenger RNA transcript of a Protease M gene, (vi)
aberrant modification of a Protease M gene, such as of the methylation pattern of the
genomic DNA, (vii) the presence of a non-wild type splicing pattern of a messenger
RNA transcript of a Protease M gene, (viii) a non-wild type level of a Protease M
protein, (ix) allelic loss of a Protease M gene, and (x) inappropriate post-translational
modification of a Protease M protein. As set out below, the present invention provides
a large number of assay techniques for detecting lesions in a Protease M gene and,
importantly, provides the ability to discern between different molecular causes
underlying Protease M-dependent aberrant cell growth, proliferation and/or
differentiation.

In an exemplary embodiment, there is provided a nucleic acid
composition comprising a (purified) oligonucleotide probe including a region of
nucleotide sequence which is capable of hybridizing to a sense or antisense sequence of
a Protease M gene, such as represented by any of SEQ ID Nos: 1 and 3, or naturally
occurring mutants thereof, or 5' or 3' flanking sequences or intronic sequences naturally
associated with the subject Protease M genes or naturally occurring mutants thereof.
The nucleic acid of a cell is rendered accessible for hybridization, the probe is contacted
with nucleic acid of the sample, and the hybridization of the probe to the sample nucleic
acid is detected. Such techniques can be used to detect lesions at either the genomic or
mRNA level, including deletions, substitutions, etc., as well as to determine mRNA
transcript levels.

As set out above, one aspect of the present invention relates to diagnostic
assays for determining, in the context of cells isolated from a patient, if mutations have
arisen in one or more Protease M genes of the sample cells. The present invention provides a
method for determining if a subject is at risk for a disorder characterized by aberrant cell
proliferation and/or metastasis. In preferred embodiments, the method can be generally
characterized as comprising detecting, in a sample of cells from the subject, the presence
or absence of a genetic lesion characterized by an alteration affecting the integrity of a
gene encoding a Protease M. To illustrate, such genetic lesions can be detected by
ascertaining the existence of at least one of (i) a deletion of one or more nucleotides
from a Protease M gene, (ii) an addition of one or more nucleotides to a Protease M
gene, (iii) a substitution of one or more nucleotides of a Protease M gene, and (iv) the
presence of a non-wild type splicing pattern of a messenger RNA transcript of a
Protease M gene. As set out below, the present invention provides a large number of
assay techniques for detecting lesions in Protease M genes and, importantly, provides
the ability to discern between different molecular causes underlying Protease
M-dependent aberrant cell growth and/or metastasis.

In certain embodiments, detection of the lesion comprises utilizing the
probe/primer in a polymerase chain reaction (PCR) (see, e.g. U.S. Patent Nos. 4,683,195
and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain
reaction (LCR) (see, e.g., Landegren et al. (1988) Science 241:1077-1080; and
Nakazawa et al. (1994) PNAS 91:360-364), the latter of which can be particularly useful
for detecting point mutations in the Protease M gene (see Abravaya et al. (1995) Nuc
Acid Res 23:675-682). In a merely illustrative embodiment, the method includes the
steps of (i) collecting a sample of cells from a patient, (ii) isolating nucleic acid (e.g.,
genomic, mRNA or both) from the cells of the sample, (iii) contacting the nucleic acid
sample with one or more primers which specifically hybridize to a Protease M gene
under conditions such that hybridization and amplification of the Protease M gene (if
present) occurs, and (iv) detecting the presence or absence of an amplification product,
or detecting the size of the amplification product and comparing the length to a control
sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary
amplification step in conjunction with any of the techniques used for detecting mutations
described herein. Alternative amplification methods include: self sustained
sequence replication (Guatelli, J.C. et al., 1990, Proc. Natl. Acad. Sci. USA 87:1874-
1878), transcriptional amplification system (Kwoh, D.Y. et al., 1989, Proc. Natl. Acad.
Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi, P.M. et al., 1988, Bio/Technology
6:1197), or any other nucleic acid amplification method, followed by the detection of the
amplified molecules using techniques well known to those of skill in the art. These
detection schemes are especially useful for the detection of nucleic acid molecules if
such molecules are present in very low numbers.
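
Step (iv) of the illustrative method, detecting an amplification product and comparing its size to a control, can be sketched in Python as a simple in-silico check. This is a toy model under stated assumptions: primers are required to match the template exactly, the sequences used with it would be hypothetical rather than Protease M sequences, and the function names are invented for illustration.

```python
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def amplicon_length(template, forward_primer, reverse_primer):
    """Length of the product bounded by the two primers, or None if none forms.

    The forward primer must match the given strand exactly; the reverse
    primer must match the opposite strand downstream of the forward primer.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return end + len(reverse_site) - start


def compare_to_control(sample, control, forward_primer, reverse_primer):
    """Report presence of a product and its size relative to the control."""
    control_len = amplicon_length(control, forward_primer, reverse_primer)
    if control_len is None:
        raise ValueError("primers do not amplify the control template")
    sample_len = amplicon_length(sample, forward_primer, reverse_primer)
    if sample_len is None:
        return "no product"
    if sample_len == control_len:
        return "product matches control size"
    return f"product differs from control by {sample_len - control_len:+d} bp"
```

A size difference flags an insertion or deletion between the primers, whereas a substitution leaves the product length unchanged and would need one of the other detection techniques described herein.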

In another embodiment of the subject assay, mutations in a Protease M
gene from a sample cell are identified by alterations in restriction enzyme cleavage
patterns. For example, sample and control DNA is isolated, amplified (optionally),
digested with one or more restriction endonucleases, and fragment length sizes are
determined by gel electrophoresis. Moreover, sequence specific ribozymes
(see, for example, U.S. Patent No. 5,498,531) can be used to score for the presence of
specific mutations by development or loss of a ribozyme cleavage site.
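
The comparison of restriction fragment lengths between sample and control DNA can be sketched as follows. This is a minimal illustration, assuming linear DNA, a single enzyme, and exact recognition-site matching; EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example, and the function names are hypothetical.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return sorted fragment lengths after cutting linear DNA at every site.

    cut_offset is the number of bases into the recognition site at which the
    strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    cut_positions = []
    index = sequence.find(site)
    while index != -1:
        cut_positions.append(index + cut_offset)
        index = sequence.find(site, index + 1)
    bounds = [0] + cut_positions + [len(sequence)]
    return sorted(right - left for left, right in zip(bounds, bounds[1:]))


def pattern_differs(sample, control, site="GAATTC", cut_offset=1):
    """True when the sample's fragment pattern differs from the control's."""
    return digest(sample, site, cut_offset) != digest(control, site, cut_offset)
```

A mutation that destroys or creates a recognition site changes the list of fragment lengths, which is what would be read off a gel.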

In yet another embodiment, any of a variety of sequencing reactions
known in the art can be used to directly sequence the Protease M gene and detect
mutations by comparing the sequence of the sample Protease M with the corresponding
wild-type (control) sequence. Exemplary sequencing reactions include those based on
techniques developed by Maxam and Gilbert (Proc. Natl. Acad. Sci. USA (1977) 74:560)
or Sanger (Sanger et al. (1977) Proc. Natl. Acad. Sci. USA 74:5463). It is also contemplated
that any of a variety of automated sequencing procedures may be utilized when
performing the subject assays (Biotechniques (1995) 19:448), including sequencing
by mass spectrometry (see, for example PCT publication WO 94/16101; Cohen et al.
(1996) Adv Chromatogr 36:127-162; and Griffin et al. (1993) Appl Biochem Biotechnol
38:147-159). It will be evident to one skilled in the art that, for certain embodiments,
the occurrence of only one, two or three of the nucleic acid bases need be determined in
the sequencing reaction. For instance, A-track or the like, e.g., where only one nucleic
acid is detected, can be carried out.
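
The comparison of a sample sequence with the wild-type (control) sequence, including the single-base variant in which only one nucleotide is tracked, can be illustrated with a short Python sketch. It assumes the two reads are already aligned and of equal length, so only substitutions are reported; the sequences used with it would be hypothetical, and the function names are invented for illustration.

```python
def substitutions(control, sample):
    """List (1-based position, control base, sample base) for each mismatch."""
    if len(control) != len(sample):
        raise ValueError("this sketch compares equal-length, pre-aligned reads")
    return [
        (position, c, s)
        for position, (c, s) in enumerate(zip(control, sample), start=1)
        if c != s
    ]


def base_track(sequence, base="A"):
    """1-based positions of a single base, as in a one-lane 'A-track' read."""
    return [i for i, b in enumerate(sequence, start=1) if b == base]


def track_differences(control, sample, base="A"):
    """Positions where the single-base tracks of control and sample disagree."""
    return sorted(set(base_track(control, base)) ^ set(base_track(sample, base)))
```

The single-base track detects only those changes that gain or lose the tracked base, which is why it suffices for certain embodiments but not for a general mutation screen.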

In a further embodiment, protection from cleavage agents (such as a 

20 nuclease,. hydroxy lamine or osmium tetroxide and with piperidine) can be used to detect 
mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers, et al. (1985) - 
Science 230: 1 242). In general, the art technique of "mismatch cleavage" starts by 
providing heteroduplexes of formed by hybridizing (labeled) RNA or DNA containing 
the wild-type Protease M scqwence with potentially mutant RNA or DNA obtained from 

25 a tissue sample. The double-stranded duplexes are treated with an agent which cleaves 
single-stranded regions of the duplex such as which will exist due to basepair 
mismatches between the control and sample strands. For instance, RNA/DNA duplexes 
can be treated v^th RNase and DNA/DNA hybrids treated with SI nuclease to 
enzymatically digesting the mismatched regions. In other embodiments, either 

30 DNA/DNA or RNA/DNA duplexes can be treated with hydroxylamine or osmium 

tetroxide and with piperidine in order to digest mismatched regions. After digestion of 
the mismatched regions, the resulting material is then separated by size on denaturing 
polyacrylamide gels to determine the site of mutation. See, for example, Cotton et al 
(1988) Proc. Natl Acad Sci USA 85:4397; Saleeba et al (1992) Methods Enzymod. 

35 217:286-295. In a preferred embodiment, the control DNA or RNA can be labeled for 
detection. 

In still another embodiment, the mismatch cleavage reaction employs one
or more proteins that recognize mismatched base pairs in double-stranded DNA (so
called "DNA mismatch repair" enzymes) in defined systems for detecting and mapping
point mutations in Protease M cDNAs obtained from samples of cells. For example, the
mutY enzyme of E. coli cleaves A at G/A mismatches and the thymidine DNA
glycosylase from HeLa cells cleaves T at G/T mismatches (Hsu et al. (1994)
Carcinogenesis 15:1657-1662). According to an exemplary embodiment, a probe based
on a Protease M sequence, e.g., a wild-type Protease M sequence, is hybridized to a
cDNA or other DNA product from a test cell(s). The duplex is treated with a DNA
mismatch repair enzyme, and the cleavage products, if any, can be detected from
electrophoresis protocols or the like. See, for example, U.S. Patent No. 5,459,039.
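
The enzyme specificities stated above can be expressed as a small lookup, shown here as a hedged Python sketch. It assumes the probe strand and the test strand are supplied already aligned base-against-base (the second strand written in the opposing orientation so that index i pairs with index i); the function name and example strands are hypothetical.

```python
# Standard Watson-Crick pairs; any other pairing is treated as a mismatch.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def repair_enzyme_sites(probe: str, opposing: str) -> list[tuple]:
    """List mismatches in an aligned duplex as
    (position, mismatch, enzyme, base cleaved); enzyme is None when neither
    of the two enzymes named in the text applies."""
    if len(probe) != len(opposing):
        raise ValueError("strands must be aligned and of equal length")
    sites = []
    for i, (a, b) in enumerate(zip(probe.upper(), opposing.upper())):
        if (a, b) in WATSON_CRICK:
            continue
        bases = {a, b}
        if bases == {"G", "A"}:
            # mutY cleaves the A of a G/A mismatch.
            sites.append((i, "G/A", "mutY", "A"))
        elif bases == {"G", "T"}:
            # Thymidine DNA glycosylase cleaves the T of a G/T mismatch.
            sites.append((i, "G/T", "thymidine DNA glycosylase", "T"))
        else:
            sites.append((i, f"{a}/{b}", None, None))
    return sites


# Hypothetical 4-base duplexes: one G/A mismatch and one G/T mismatch.
ga_sites = repair_enzyme_sites("GGTC", "CAAG")
gt_sites = repair_enzyme_sites("GGTC", "CTAG")
```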

In other embodiments, alterations in electrophoretic mobility will be used
to identify mutations in Protease M genes. For example, single strand conformation
polymorphism (SSCP) may be used to detect differences in electrophoretic mobility
between mutant and wild type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci.
USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; and Hayashi (1992)
Genet. Anal. Tech. Appl. 9:73-79). Single-stranded DNA fragments of sample and control
Protease M nucleic acids will be denatured and allowed to renature. The secondary
structure of single-stranded nucleic acids varies according to sequence; the resulting
alteration in electrophoretic mobility enables the detection of even a single base change.
The DNA fragments may be labeled or detected with labeled probes. The sensitivity of
the assay may be enhanced by using RNA (rather than DNA), in which the secondary
structure is more sensitive to a change in sequence. In a preferred embodiment, the
subject method utilizes heteroduplex analysis to separate double stranded heteroduplex
molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends
Genet. 7:5).

In yet another embodiment, the movement of mutant or wild-type
fragments in polyacrylamide gels containing a gradient of denaturant is assayed using
denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495).
When DGGE is used as the method of analysis, DNA will be modified to ensure that it
does not completely denature, for example by adding a GC clamp of approximately 40
bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature
gradient is used in place of a denaturing agent gradient to identify differences in the
mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys. Chem.
265:12753).

Examples of other techniques for detecting point mutations include, but
are not limited to, selective oligonucleotide hybridization, selective amplification, or
selective primer extension. For example, oligonucleotide primers may be prepared in
which the known mutation is placed centrally and then hybridized to target DNA under
conditions which permit hybridization only if a perfect match is found (Saiki et al.
(1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci. USA 86:6230). Such
allele specific oligonucleotide hybridization techniques may be used to test one
mutation per reaction when oligonucleotides are hybridized to PCR amplified target
DNA or a number of different mutations when the oligonucleotides are attached to the
hybridizing membrane and hybridized with labeled target DNA.
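
As a minimal sketch of the central placement described above, the following Python function builds a wild-type and a mutant oligonucleotide of odd length centered on a known mutation site. The function name, the default length of 19 bases, and the example sequence are assumptions made for illustration only.

```python
def allele_specific_probes(
    reference: str, position: int, mutant_base: str, length: int = 19
) -> tuple[str, str]:
    """Return (wild_type_probe, mutant_probe), each `length` bases long,
    with the base at 0-based `position` of `reference` placed centrally."""
    if length % 2 == 0:
        raise ValueError("length must be odd so the site can be central")
    half = length // 2
    start, end = position - half, position + half + 1
    if start < 0 or end > len(reference):
        raise ValueError("probe would extend past the reference sequence")
    wild_type = reference[start:end]
    # Substitute only the central base to obtain the mutant-specific probe.
    mutant = wild_type[:half] + mutant_base + wild_type[half + 1:]
    return wild_type, mutant


# Hypothetical 18-base reference with an A-to-G change at position 8.
reference = "ATGGCCTTAGGCATTCGA"
wild_probe, mutant_probe = allele_specific_probes(reference, 8, "G", length=7)
```

Each probe differs from the other at exactly one central base, which is the position where a mismatch is most destabilizing under stringent hybridization conditions.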

Alternatively, allele specific amplification technology which depends on
selective PCR amplification may be used in conjunction with the instant invention.
Oligonucleotides used as primers for specific amplification may carry the mutation of
interest in the center of the molecule (so that amplification depends on differential
hybridization) (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3'
end of one primer where, under appropriate conditions, mismatch can prevent or reduce
polymerase extension (Prossner (1993) Tibtech 11:238). In addition, it may be desirable
to introduce a novel restriction site in the region of the mutation to create cleavage-based
detection (Gasparini et al. (1992) Mol. Cell Probes 6:1). It is anticipated that in certain
embodiments amplification may also be performed using Taq ligase
(Barany (1991) Proc. Natl. Acad. Sci. USA 88:189). In such cases, ligation will occur
only if there is a perfect match at the 3' end of the 5' sequence, making it possible to
detect the presence of a known mutation at a specific site by looking for the presence or
absence of amplification.
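
The 3'-terminal design described above can be illustrated with a short, hedged Python sketch. It assumes an idealized all-or-nothing model in which extension is predicted only when the 3'-terminal base of the primer matches the target; real reactions depend on conditions, as the text notes. The function names and sequences are hypothetical.

```python
def allele_specific_primer(
    reference: str, position: int, allele_base: str, length: int = 20
) -> str:
    """Build a forward primer whose 3'-terminal base is `allele_base`,
    positioned over 0-based `position` of `reference`."""
    start = position - length + 1
    if start < 0:
        raise ValueError("primer would extend past the start of the reference")
    return reference[start:position] + allele_base


def three_prime_matches(primer: str, target: str, position: int) -> bool:
    """Idealized prediction: extension occurs only if the primer's
    3'-terminal base equals the target base at `position`."""
    return primer[-1] == target[position]


# Hypothetical wild-type target and a target carrying an A-to-G change.
wild_type = "ATGGCCTTAGGCATTCGA"
mutant = wild_type[:8] + "G" + wild_type[9:]
primer = allele_specific_primer(wild_type, 8, "G", length=6)
```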

Another embodiment of the invention provides for a nucleic acid
composition comprising a (purified) oligonucleotide probe including a region of
nucleotide sequence which is capable of hybridizing to a sense or antisense sequence of
a Protease M gene, or naturally occurring mutants thereof, or 5' or 3' flanking sequences
or intronic sequences naturally associated with the subject Protease M genes or naturally
occurring mutants thereof. The nucleic acid of a cell is rendered accessible for
hybridization, the probe is exposed to nucleic acid of the sample, and the hybridization
of the probe to the sample nucleic acid is detected. Such techniques can be used to
detect lesions at either the genomic or mRNA level, including deletions, substitutions,
etc., as well as to determine mRNA transcript levels. Such oligonucleotide probes can
be used for both predictive and therapeutic evaluation of allelic mutations which might
be manifest in, for example, neoplastic or hyperplastic disorders (e.g. aberrant cell
growth).

To illustrate, nucleotide probes can be generated from the subject
Protease M gene which facilitate histological screening of intact tissue and tissue
samples for the presence (or absence) of Protease M-encoding transcripts. Similar to the
diagnostic uses of anti-Protease M antibodies, probes directed to Protease M
messages, or to genomic Protease M sequences, can be used for both predictive and
therapeutic evaluation of allelic mutations which might be manifest in, for example,
neoplastic or hyperplastic disorders (e.g. unwanted cell growth) or abnormal
differentiation of tissue. Used in conjunction with immunoassays as described above,
the oligonucleotide probes can help facilitate the determination of the molecular basis
for a developmental disorder which may involve some abnormality associated with
expression (or lack thereof) of a Protease M protein. For instance, variation in
polypeptide synthesis can be differentiated from a mutation in a coding sequence.

Diagnostic procedures may be performed on any "biological sample" 
including, for example, cells, body fluids, or in situ directly upon tissue sections (fixed
and/or frozen) of patient tissue obtained from biopsies or resections, such that no nucleic 
acid purification is necessary. 

Antibodies directed against wild type or mutant Protease M proteins,
which are discussed above, may also be used in disease diagnostics and prognostics.
Such diagnostic methods may be used to detect abnormalities in the level of Protease
M protein expression, or abnormalities in the structure and/or tissue, cellular, or
subcellular location of Protease M protein. Structural differences may include, for
example, differences in the size, electronegativity, or antigenicity of the mutant
Protease M protein relative to the normal Protease M protein. Protein from the tissue
or cell type to be analyzed may easily be detected or isolated using techniques which are
well known to one of skill in the art, including but not limited to western blot analysis.
For a detailed explanation of methods for carrying out western blot analysis, see
Sambrook et al., 1989, supra, at Chapter 18. The protein detection and isolation methods
employed herein may also be such as those described in Harlow and Lane, for example 
(Harlow, E. and Lane, D., 1988, "Antibodies: A Laboratory Manual", Cold Spring 
Harbor Laboratory Press, Cold Spring Harbor, New York), which is incorporated herein 
by reference in its entirety. 

This can be accomplished, for example, by any of a number of techniques 
known in the art, such as, for example, immunofluorescence techniques employing a 
fluorescently labeled antibody (see below) coupled with light microscopic, flow 
cytometric, or fluorimetric detection. The term "labeled or labelable", with regard to 
the probe or antibody, is intended to encompass direct labeling of the probe or antibody 
by coupling (i.e., physically linking) a detectable substance to the probe or antibody as 
well as indirect labeling of the probe or antibody by reactivity with another reagent that
is directly labeled. The antibodies (or fragments thereof)
may, additionally, be employed histologically, as in immunofluorescence or
immunoelectron microscopy, for in situ detection of Protease M proteins. In situ
detection may be accomplished by removing a histological specimen from a patient, and
applying thereto a labeled antibody of the present invention. The antibody (or fragment)
is preferably applied by overlaying the labeled antibody (or fragment) onto a biological
sample. Through the use of such a procedure, it is possible to determine not only the 
presence of the Protease M protein, but also its distribution in the examined tissue. 
Using the present invention, one of ordinary skill will readily perceive that any of a wide 
variety of histological methods (such as staining procedures) can be modified in order to 

achieve such in situ detection.

One means for labeling an anti-Protease M protein specific antibody is
via linkage to an enzyme and use in an enzyme immunoassay (EIA) (Voller, "The
Enzyme Linked Immunosorbent Assay (ELISA)", Diagnostic Horizons 2:1-7, 1978,
Microbiological Associates Quarterly Publication, Walkersville, MD; Voller et al., J.
Clin. Pathol. 31:507-520 (1978); Butler, Meth. Enzymol. 73:482-523 (1981); Maggio,
(ed.) Enzyme Immunoassay, CRC Press, Boca Raton, FL, 1980; Ishikawa et al. (eds.)
Enzyme Immunoassay, Igaku Shoin, Tokyo, 1981). The enzyme which is bound to the
antibody will react with an appropriate substrate, preferably a chromogenic substrate, in
such a manner as to produce a chemical moiety which can be detected, for example, by
spectrophotometric, fluorimetric or visual means. Enzymes which can be used to
detectably label the antibody include, but are not limited to, malate dehydrogenase,
staphylococcal nuclease, delta-5-steroid isomerase, yeast alcohol dehydrogenase,
alpha-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish peroxidase,
alkaline phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease,
urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase and
acetylcholinesterase. The detection can be accomplished by colorimetric methods which
employ a chromogenic substrate for the enzyme. Detection may also be accomplished
by visual comparison of the extent of enzymatic reaction of a substrate in comparison
with similarly prepared standards.

Detection may also be accomplished using any of a variety of other
methods. Antibodies may be labeled with radioactivity, fluorescent compounds (e.g.,
fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin,
o-phthaldehyde and fluorescamine), chemiluminescent compounds (e.g., luminol,
isoluminol, theromatic acridinium ester, imidazole, acridinium salt and oxalate ester), or
bioluminescent compounds (e.g., luciferin, luciferase and aequorin).

Moreover, it will be understood that any of the above methods for
detecting alterations in a Protease M gene or gene product can be used to monitor the
course of treatment or therapy.

In another embodiment, detection of Protease M is based on the detection
of a Protease M bioactivity, such as enzymatic activity. For example, serine protease
substrate cleavage may be measured in a sample. Exemplary substrates include gelatin,
casein, or N-α-benzoyl-L-arginine ethyl ester (BAEE).

In an exemplary embodiment, the invention provides a diagnostic method 
comprising: (1) contacting a tumor sample from a subject with an agent capable of
detecting Protease M protein or mRNA; (2) determining the amount of Protease M
protein or mRNA expressed in the tumor sample; (3) comparing the amount of Protease
M protein or mRNA expressed in the tumor sample to a control sample; and (4) forming
a diagnosis based on the amount of Protease M protein or mRNA expressed in the tumor 
sample as compared to the control sample. 
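
The comparison in steps (3) and (4) can be sketched as follows. This Python example is a hedged illustration only: the two-fold threshold, the function name, and the numeric values are assumptions and are not specified by the method; the interpretation of an increase or decrease depends on which control is used.

```python
def compare_to_control(
    tumor_level: float, control_level: float, fold_threshold: float = 2.0
) -> str:
    """Classify the tumor-sample expression level relative to the control.

    Returns "increased", "decreased", or "unchanged" using a symmetric
    fold-change cutoff (assumed value; to be set empirically)."""
    if control_level <= 0 or tumor_level < 0:
        raise ValueError("control must be positive and tumor non-negative")
    ratio = tumor_level / control_level
    if ratio >= fold_threshold:
        return "increased"
    if ratio <= 1.0 / fold_threshold:
        return "decreased"
    return "unchanged"


# Hypothetical signal intensities in arbitrary units.
primary_call = compare_to_control(tumor_level=10.0, control_level=2.0)
metastatic_call = compare_to_control(tumor_level=1.0, control_level=4.0)
```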

In a preferred embodiment of the detection method, the biological sample 
is a tumor sample. The tumor sample may comprise tumor tissue or a suspension of 
tumor cells. A tissue section, for example, a freeze-dried or fresh frozen section of 
tumor tissue removed from a patient, can be used as the tumor sample. Moreover, the
tumor sample may comprise a biological fluid obtained from a tumor-bearing subject. 
Protease M contains a signal sequence and thus is likely to be detectable in biological 
fluids. Following collection, tumor samples can be stored at temperatures below -20°C 
to prevent degradation until the detection method is to be performed. Preferred tumor 
samples in which Protease M mRNA or protein is to be detected are mammary tumor 
samples and/or ovarian tumor samples. Primary malignancy of the tumor cell sample 
can be diagnosed based on an increase in the level of expression of Protease M mRNA
or protein in the tumor sample as compared to the control. In another embodiment, the 
control is from normal cells or a primary tumor and the tumor sample is a suspected 
metastatic tumor sample. Acquisition of the metastatic phenotype by the suspected 
metastatic tumor sample can be diagnosed based on a decrease in the level of, or absence
of, Protease M mRNA or protein in the tumor sample compared to the control. The
detection method of the invention can be used to detect Protease M mRNA or protein in
a biological sample in vitro as well as in vivo. For example, in vitro techniques for
detection of Protease M mRNA include Northern hybridizations and in situ 
hybridizations. In vitro techniques for detection of Protease M protein include enzyme 
linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and 
immunofluorescence. Alternatively, Protease M protein can be detected in vivo in a
subject by introducing into the subject a labeled anti-Protease M antibody. For example,
the antibody can be labeled with a radioactive marker whose presence and location in a
subject can be detected by standard imaging techniques.

The methods described herein may be performed, for example, by
utilizing pre-packaged diagnostic kits comprising at least one probe nucleic acid or
antibody reagent described herein, which may be conveniently used, e.g., in clinical
settings to diagnose patients exhibiting symptoms or family history of a disease or
illness involving a Protease M gene. For example, the kit can comprise a labeled or
labelable agent capable of detecting Protease M protein or mRNA in a biological
sample; means for determining the amount of Protease M in the sample; and means for
comparing the amount of Protease M in the sample with a standard. The agent can be
packaged in a suitable container. The kit can further comprise instructions for using the 
kit to detect Protease M mRNA or protein. 

5. Therapeutic Uses 

Another aspect of the invention pertains to methods of modulating
Protease M bioactivity associated with a cell, e.g., for therapeutic purposes. Protease M
activity "associated with a cell" is intended to include Protease M activity within the
cell, secreted by the cell and in the extracellular milieu surrounding the cell. The
modulatory method of the invention involves contacting the cell with an agent that
modulates Protease M activity associated with the cell. In one embodiment, the agent
stimulates Protease M serine protease activity. Examples of such stimulatory agents
include active Protease M protein agonists and a nucleic acid molecule encoding
Protease M that has been introduced into the cell. In another embodiment, the agent
inhibits the Protease M activity. Examples of such inhibitory agents include antisense
Protease M nucleic acid molecules and anti-Protease M antibodies. These modulatory
methods can be performed in vitro (e.g., by culturing the cell with the agent) or,
alternatively, in vivo (e.g., by administering the agent to a subject).

Stimulation of Protease M bioactivity is desirable in situations in which
Protease M is abnormally downregulated and/or in which increased Protease M activity
is likely to have a beneficial effect. One example of such a situation is in tumor cells,
and in particular metastatic tumor cells. As demonstrated in the appended Examples,
acquisition of a metastatic phenotype by tumor cells is associated with downregulation
of Protease M expression. Thus, increasing the expression and/or activity of Protease M
in or around the tumor cells is expected to inhibit the development or progression of the
metastatic phenotype. Accordingly, in a specific embodiment, the invention provides a
method for inhibiting development or progression of a tumorigenic phenotype in a cell
comprising contacting the tumor cell with an agent which elevates the amount of
Protease M associated with the cell. The agent that elevates Protease M in or around the
tumor cell can be Protease M protein itself. For example, since Protease M is likely to
be a secreted protein, it is likely that it exerts tumor suppressive effects extracellularly.
Thus, Protease M, preferably in a pharmaceutically acceptable carrier, can be
administered to a tumor-bearing subject by an appropriate route to inhibit the
development or progression of a proliferative disorder. Suitable routes of administration
include intravenous, intramuscular or subcutaneous injection, injection directly into the 
tumor site or implantation of a device containing a slow-release formulation. The 
Protease M preparation can also be incorporated into liposomes or other carrier vehicles 
to facilitate delivery to the tumor site. A non-limiting dosage range is 0.001 to 100 
mg/kg/day, with the most beneficial range to be determined by routine pharmacological 
methods. 
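
For orientation, the arithmetic implied by the non-limiting range of 0.001 to 100 mg/kg/day is shown in the Python sketch below. The function name and the example body weight and dose are hypothetical; the sketch only multiplies a per-kilogram dose by body weight and checks that the dose lies within the stated range, and it is not dosing guidance.

```python
MIN_DOSE_MG_PER_KG = 0.001  # lower bound of the stated non-limiting range
MAX_DOSE_MG_PER_KG = 100.0  # upper bound of the stated non-limiting range


def daily_dose_mg(body_weight_kg: float, dose_mg_per_kg: float) -> float:
    """Total daily amount in mg for a given body weight and per-kg dose."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not MIN_DOSE_MG_PER_KG <= dose_mg_per_kg <= MAX_DOSE_MG_PER_KG:
        raise ValueError("dose lies outside the stated 0.001-100 mg/kg/day range")
    return body_weight_kg * dose_mg_per_kg


# Hypothetical example: a 70 kg subject at 0.5 mg/kg/day.
total_mg = daily_dose_mg(70.0, 0.5)
```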

As an alternative to administration of Protease M protein or agonist itself, the
development or progression of cancer in cells may be slowed by modifying the cells to
express Protease M by introducing into the cells a nucleic acid encoding Protease M
(e.g., via a recombinant expression vector). Expression vectors suitable for gene 
therapy, including retroviral and adenoviral vectors carrying appropriate regulatory 
elements, can be used to deliver the Protease M-encoding nucleic acid to the tumor cells. 

The ability of Protease M protein or DNA to inhibit tumor progression 
and/or metastasis can be evaluated using in vivo and in vitro assays known in the art. 
For example, a suitable in vivo assay utilizes the mammary epithelial tumor cell line 
MDA-MB-435, which forms tumors at the site of orthotopic injection and metastasizes 
in nude mice (described further in Price et al. (1990) Cancer Res. 50:717). MDA-MB-435
cells, which do not express detectable Protease M mRNA, can be transfected with a
Protease M expression vector and stable transfectants can be selected. These
transfectants can then be injected into nude mice. At 10 weeks post-inoculation, the
mice are sacrificed and their tumors are excised and weighed to determine the effect of 
Protease M expression on tumor progression and metastasis. A suitable in vitro assay is 
tumor cell invasion through reconstituted basement membrane matrix (e.g., Matrigel) as 
described in Hendrix et al. (1987) Cancer Letters 38:137. The invasive ability of 
Protease M-transfected MDA-MB-435 cells can be compared to untransfected MDA- 
MB-435 cells to determine the effect of Protease M expression on tumor invasiveness. 

In contrast to the foregoing situations in which stimulation of Protease M 
activity is desirable, there are other situations in which it may be desirable to decrease 
Protease M activity using an inhibitory method of the invention. For example, as 
demonstrated herein, Protease M mRNA expression is markedly upregulated in certain
primary tumor cells. Thus, inhibiting the expression or activity of Protease M in cells
may be useful for inhibiting or reducing carcinogenesis. 

C. Drug Screening Assays

Furthermore, by making available purified and recombinant Protease M
polypeptides, the present invention facilitates the development of assays which can be
used to screen for drugs, including Protease M homologs, which are either agonists or
antagonists of the normal cellular function of the subject Protease M polypeptides, or of
their role in the pathogenesis of cellular differentiation and/or proliferation and disorders
related thereto. In one embodiment, the assay evaluates the ability of a compound to
modulate binding between a Protease M polypeptide and a molecule, be it protein or
DNA, that interacts either upstream or downstream of the Protease M polypeptide in the
TGFb signaling pathway. A variety of assay formats will suffice and, in light of the
present inventions, will be comprehended by a skilled artisan.

1. Cell-free assays

In many drug screening programs which test libraries of compounds and
natural extracts, high throughput assays are desirable in order to maximize the number
of compounds surveyed in a given period of time. Assays which are performed in
cell-free systems, such as may be derived with purified or semi-purified proteins, are often
preferred as "primary" screens in that they can be generated to permit rapid development
and relatively easy detection of an alteration in a molecular target which is mediated by
a test compound. Moreover, the effects of cellular toxicity and/or bioavailability of the
test compound can be generally ignored in the in vitro system, the assay instead being
focused primarily on the effect of the drug on the molecular target as may be manifest in
an alteration of binding affinity with upstream or downstream elements.

In an exemplary screening assay of the present invention, the compound
of interest is contacted with Protease M and a molecule which interacts with Protease M
(including both activators and repressors of its activity), such as a substrate. Detection
and quantification of complexes of Protease M with its binding protein provide a means
for determining a compound's efficacy at inhibiting (or potentiating) complex formation
between Protease M and the Protease M-binding elements. The efficacy of the
compound can be assessed by generating dose response curves from data obtained using
various concentrations of the test compound. Moreover, a control assay can also be
performed to provide a baseline for comparison. In the control assay, isolated and
purified Protease M polypeptide is added to a composition containing the
Protease M-binding element, and the formation of a complex is quantitated in the absence
of the test compound.
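
One simple way to summarize such a dose response curve is to estimate the concentration giving half of the control-assay signal. The Python sketch below does this by log-linear interpolation between the two tested concentrations that bracket 50%; the interpolation approach, the function name, and the example data are assumptions chosen for illustration, and a fitted sigmoidal model may be preferred in practice.

```python
import math


def ic50_by_interpolation(concentrations, responses):
    """Estimate the concentration giving a response of 0.5.

    `concentrations` are positive and in ascending order; `responses` are
    complex-formation signals expressed as a fraction of the control assay
    (1.0 = same as the no-compound baseline). Returns None when the data
    never cross 0.5 from above."""
    points = list(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo >= 0.5 >= r_hi and r_lo != r_hi:
            # Fractional distance of 0.5 between the bracketing responses,
            # applied on a log10 concentration axis.
            frac = (r_lo - 0.5) / (r_lo - r_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Hypothetical data: concentrations in micromolar, responses as fractions.
estimate = ic50_by_interpolation([0.1, 1.0, 10.0, 100.0], [0.95, 0.75, 0.25, 0.05])
```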

Complex formation between the Protease M polypeptide and a Protease
M binding element may be detected by a variety of techniques. Modulation of the
formation of complexes can be quantitated using, for example, detectably labeled
proteins such as radiolabeled, fluorescently labeled, or enzymatically labeled Protease 
M polypeptides, by immunoassay, or by chromatographic detection. 

For example, modulators of Protease M activity may be identified in a 
method wherein Protease M, a substrate for the serine protease, and a test substance are 
incubated under conditions suitable for the serine protease to cleave the substrate. 
Cleavage of the substrate is then measured and the amount of cleavage of the substrate 
in the presence of the test substance is compared to the amount of cleavage of the 
substrate in the absence of the test substance. The test substance can then be identified 
as a modulator of Protease M activity based on this comparison. For example, when the
amount of cleavage of the substrate in the presence of the test substance is less than the
amount of cleavage of the substrate in the absence of the test substance, the test
substance can thereby be identified as an inhibitor of the Protease M activity.
Alternatively, when the amount of cleavage of the substrate in the presence of the test
substance is greater than the amount of cleavage of the substrate in the absence of the
test substance, the test substance can thereby be identified as a stimulator of the Protease
M activity.
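
The comparison can be sketched in Python as follows, taking reduced substrate cleavage in the presence of the test substance to indicate inhibition and increased cleavage to indicate stimulation. The 10% tolerance band, the function name, and the example readings are assumptions for illustration; an appropriate cutoff would be set from assay variability.

```python
def classify_modulator(
    cleavage_with: float, cleavage_without: float, tolerance: float = 0.1
) -> str:
    """Classify a test substance from substrate cleavage measured with and
    without it. Relative changes smaller than `tolerance` are treated as
    no effect (assumed cutoff)."""
    if cleavage_without <= 0:
        raise ValueError("cleavage without test substance must be positive")
    change = (cleavage_with - cleavage_without) / cleavage_without
    if change <= -tolerance:
        return "inhibitor"
    if change >= tolerance:
        return "stimulator"
    return "no effect"


# Hypothetical cleavage readings in arbitrary units.
call_a = classify_modulator(cleavage_with=40.0, cleavage_without=100.0)
call_b = classify_modulator(cleavage_with=150.0, cleavage_without=100.0)
```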

Typically, it will be desirable to immobilize either Protease M or its
binding protein to facilitate separation of complexes from uncomplexed forms of one or
both of the proteins, as well as to accommodate automation of the assay. Binding of
Protease M to a binding protein, in the presence and absence of a candidate agent, can
be accomplished in any vessel suitable for containing the reactants. Examples include
microtitre plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion
protein can be provided which adds a domain that allows the protein to be bound to a
matrix. For example, glutathione-S-transferase/Protease M (GST/Protease M) fusion
proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis,
MO) or glutathione derivatized microtitre plates, which are then combined with the cell
lysates (e.g., 35S-labeled) and the test compound, and the mixture incubated under
conditions conducive to complex formation, e.g. at physiological conditions for salt and
pH, though slightly more stringent conditions may be desired. Following incubation, the
beads are washed to remove any unbound label, and the matrix immobilized and
radiolabel determined directly (e.g. beads placed in scintillant), or in the supernatant after
the complexes are dissociated. Alternatively, the complexes can be
dissociated from the matrix, separated by SDS-PAGE, and the level of Protease M-
binding protein found in the bead fraction quantitated from the gel using standard
electrophoretic techniques such as described in the appended examples.

Other techniques for immobilizing proteins on matrices are also available
for use in the subject assay. For instance, either Protease M or its cognate binding
protein can be immobilized utilizing conjugation of biotin and streptavidin. For
instance, biotinylated Protease M molecules can be prepared from biotin-NHS (N-
hydroxy-succinimide) using techniques well known in the art (e.g., biotinylation kit,
Pierce Chemicals, Rockford, IL), and immobilized in the wells of streptavidin-coated 96
well plates (Pierce Chemical). Alternatively, antibodies reactive with Protease M but
which do not interfere with binding of Protease M and a binding protein (BP) can be
derivatized to the wells of the plate, and Protease M trapped in the wells by antibody
conjugation. As above, preparations of a Protease M-binding protein and a test
compound are incubated in the Protease M-presenting wells of the plate, and the amount
of complex trapped in the well can be quantitated. Exemplary methods for detecting
such complexes, in addition to those described above for the GST-immobilized
complexes, include immunodetection of complexes using antibodies reactive with the
Protease M binding element, or which are reactive with Protease M protein and compete
with the binding element; as well as enzyme-linked assays which rely on detecting an
enzymatic activity associated with the binding element, either intrinsic or extrinsic
activity. In the instance of the latter, the enzyme can be chemically conjugated or
provided as a fusion protein with the Protease M-BP. To illustrate, the Protease M-BP
can be chemically cross-linked or genetically fused with horseradish peroxidase, and the
amount of polypeptide trapped in the complex can be assessed with a chromogenic
substrate of the enzyme, e.g. 3,3'-diamino-benzidine tetrahydrochloride or 4-chloro-1-
naphthol. Likewise, a fusion protein comprising the polypeptide and glutathione-S-
transferase can be provided, and complex formation quantitated by detecting the GST
activity using 1-chloro-2,4-dinitrobenzene (Habig et al. (1974) J. Biol. Chem. 249:7130).

For processes which rely on immunodetection for quantitating one of the
proteins trapped in the complex, antibodies against the protein, such as anti-Protease M
antibodies, can be used. Alternatively, the protein to be detected in the complex can be
"epitope tagged" in the form of a fusion protein which includes, in addition to the
Protease M sequence, a second polypeptide for which antibodies are readily available
(e.g. from commercial sources). For instance, the GST fusion proteins described above
can also be used for quantification of binding using antibodies against the GST moiety.
Other useful epitope tags include myc-epitopes (e.g., see Ellison et al. (1991) J. Biol.
Chem. 266:21150-21157), which include a 10-residue sequence from c-myc, as well as
the pFLAG system (International Biotechnologies, Inc.) or the pEZZ-protein A system
(Pharmacia, NJ).

2. Cell based assays 

In addition to cell-free assays, such as described above, the readily
available source of mammalian Protease M proteins provided by the present invention
also facilitates the generation of cell-based assays for identifying small molecule
agonists/antagonists and the like. For example, cells can be caused to overexpress a
recombinant Protease M protein in the presence and absence of a test agent of interest,
and the assay scored for modulation in Protease M bioactivity in the target cell
mediated by the test agent. As with the cell-free assays, agents which produce a
statistically significant change in Protease M-dependent responses (either inhibition or
potentiation) can be identified. In an illustrative embodiment, the expression or activity 
of a Protease M is modulated in or cells and the effects of compounds of interest on the 

readout of interest (such as tumorigenesis or metastatic potential) are measured.

In another embodiment, modulators of Protease M expression are
identified in a method wherein a cell is contacted with a test substance and the
expression of Protease M mRNA or protein in the cell is determined. The level of
expression of Protease M mRNA or protein in the presence of the test substance is
compared to the level of expression of Protease M mRNA or protein in the absence of
the test substance. The test substance can then be identified as a modulator of Protease
M expression based on this comparison. For example, when expression of Protease M
mRNA or protein is greater in the presence of the test substance than in its absence, the
test substance is identified as a stimulator of Protease M mRNA or protein expression.
Alternatively, when expression of Protease M mRNA or protein is less in the presence of
the test substance than in its absence, the test substance is identified as an inhibitor of
Protease M mRNA or protein expression. The level of Protease M mRNA or protein
expression in the cells can be determined by methods described above for detecting
Protease M mRNA or protein. Alternatively, the regulatory regions of a Protease M
gene, e.g., the 5' flanking promoter and enhancer regions, may be operably linked to a
gene encoding a detectable marker (such as luciferase) whose product can be readily
detected.

Monitoring the influence of compounds on cells may be applied not only 
in basic drug screening, but also in clinical trials. In such clinical trials, the expression 
of a panel of genes may be used as a "read out" of a particular drug's therapeutic effect.

In yet another aspect of the invention, the subject Protease M
polypeptides can be used in a two hybrid assay (see, for example, U.S. Patent
No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem.
268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al.
(1993) Oncogene 8:1693-1696; and Brent WO94/10300), for isolating coding sequences
for other cellular proteins which bind to or interact with Protease M ("Protease M-
binding proteins" or "Protease M-bp"), such as a substrate. Such Protease M-binding
proteins would likely also be involved in the development of carcinogenesis or
metastases.

Briefly, the two hybrid assay relies on reconstituting in vivo a functional
transcriptional activator protein from two separate fusion proteins. In particular, the
method makes use of chimeric genes which express hybrid proteins. To illustrate, a first
hybrid gene comprises the coding sequence for a DNA-binding domain of a
transcriptional activator fused in frame to the coding sequence for a Protease M
polypeptide. The second hybrid gene encodes a transcriptional activation domain
fused in frame to a sample gene from a cDNA library. If the bait and sample hybrid
proteins are able to interact, e.g., form a Protease M-dependent complex, they bring into
close proximity the two domains of the transcriptional activator. This proximity is
sufficient to cause transcription of a reporter gene which is operably linked to a
transcriptional regulatory site responsive to the transcriptional activator, and expression
of the reporter gene can be detected and used to score for the interaction of the Protease
M and sample proteins.

The present invention is further illustrated by the following examples
which should not be construed as limiting in any way. The contents of all cited
references (including literature references, issued patents, published patent applications)
as cited throughout this application are hereby expressly incorporated by reference.

The practice of the present invention will employ, unless otherwise
indicated, conventional techniques of cell biology, cell culture, molecular biology,
transgenic biology, microbiology, recombinant DNA, and immunology, which are
within the skill of the art. Such techniques are explained fully in the literature. See, for
example, Molecular Cloning A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch
and Maniatis (Cold Spring Harbor Laboratory Press: 1989); DNA Cloning, Volumes I
and II (D. N. Glover ed., 1985); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Mullis
et al. U.S. Patent No: 4,683,195; Nucleic Acid Hybridization (B. D. Hames & S. J.
Higgins eds. 1984); Transcription And Translation (B. D. Hames & S. J. Higgins eds.
1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized
Cells And Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular
Cloning (1984); the treatise, Methods In Enzymology (Academic Press, Inc., N.Y.); Gene
Transfer Vectors For Mammalian Cells (J. H. Miller and M. P. Calos eds., 1987, Cold
Spring Harbor Laboratory); Methods In Enzymology, Vols. 154 and 155 (Wu et al. eds.),
Immunochemical Methods In Cell And Molecular Biology (Mayer and Walker, eds.,
Academic Press, London, 1987); Handbook Of Experimental Immunology, Volumes I-
IV (D. M. Weir and C. C. Blackwell, eds., 1986); Manipulating the Mouse Embryo,
(Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986).

Exemplification 

Example 1.

Materials and Methods Used in the Example. 
Mammary Cell Strains and Lines 

Normal human mammary epithelial cell strains (70N and 76N) were
derived from reduction mammoplasties in this lab as described (Band V, Sager R.
(1989) Proc. Natl. Acad. Sci. USA 86:1249-1253). Primary (21PT, 21NT) and metastatic
(21MT-1, 21MT-2) tumor lines were established in this lab from a single patient as
described (Band V and Sager R. (1989) Proc. Natl. Acad. Sci. USA 86:1249-1253;
Band V, et al. (1990) Cancer Res. 50:7351-7357). Human mammary epithelial tumor
cell lines MCF-7, T47D, ZR-75-1, BT549, MDA-MB-157, MDA-MB-231, MDA-MB-
435, MDA-MB-436, MDA-MB-361, and BT-474 were obtained from American Tissue
Culture Collection (Rockville, MD). Cells were grown in DFCI-1 media (Schachter M.
(1980) Pharmacol. Rev. 31:1-17) and harvested at approximately 70% confluence for
RNA isolation and when near confluent for DNA isolation.


Prostate Cell Lines 

Normal, immortalized prostate epithelial cell lines CF3 (HPV
immortalized), CF91 (SV40 immortalized), and MLC (SV40 immortalized) were used
in experiments. The tumor cell lines DU145, LNCaP, and PC3 (American Tissue
Culture Collection, Rockville, MD) were also used.

Ovarian Cell Cultures and Tissues 

The primary human ovarian surface epithelial cell cultures (HOSE 10/11,
16, and 21) were established from the ovarian surface epithelium as described previously
(Tsao SW, Mok SC, Fey E, et al. (1995) Exp. Cell Res. 218:499-507). Immortalized
ovarian surface epithelial cells (HOSE6.3E6E7) were obtained by infecting the HOSE
[...] (Tsao SW, et al. (1995) Exp. Cell Res. 218:499-507). The eight ovarian carcinoma
cell lines used for this comparative study include DOV13, OVCA420, OVCA429,
OVCA432, and OVCA433, which were established in the laboratory of Gynecologic
Oncology; CAOV3 and SKOV3, which were purchased from ATCC (Rockville, MD); and
OVCA3, which was obtained from the National Cancer Institute (Frederick, MD).

Ovarian tumors were obtained from consenting patients at the Brigham
and Women's Hospital in Boston as described previously (Mok SC, et al. (1992) Cancer
Res. 52:5119-5122). These include six borderline ovarian tumors (354A, 373A, 395A,
405A, 466A, and 469A); twenty stage III/IV high grade invasive ovarian
adenocarcinomas from the primary ovarian site; two metastatic adenocarcinomas from
colon primary tumors (327A, 339A); and three normal ovaries (366N, 379N, and 465N).

Differential Display of mRNA 

Total cell RNAs (50 µg) from 21PT and 21MT-1 were treated with
DNase I (Worthington DPRF) in the presence of RNasin ribonuclease inhibitor
(Promega) to remove residual DNA contamination as described elsewhere (Sager R, et
al. (1993) FASEB J. 7:964-970). Differential display of the mRNA was performed as
described (Liang L, Pardee AB. (1992) Science 257:967-970; Liang L, Averboukh L,
Pardee AB. (1993) Nucleic Acids Res. 21:3267-3275). Basically, the RNAs were reverse
transcribed using the 3'-anchored primer T12MG (where M is a mixture of A, G, or C).
The resultant cDNAs were then PCR amplified in the presence of 35S-dATP using
T12MG and the arbitrary primer OPA1 (CAGGCCCTTC) and run side by side on a 6%
sequencing gel. Differentially displayed bands were recovered from the dried gel,
reamplified by PCR, 32P-labeled by the oligo method (Feinberg AP and Vogelstein B.
(1983) Anal. Biochem. 132:6-13) and used as a probe on Northern strips prepared with
21PT and 21MT-1 total RNA to confirm the result obtained by differential display.
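
The anchored primer notation used above can be made concrete with a short sketch. It
assumes that T12MG denotes twelve T residues followed by the degenerate position M
(a mixture of A, G, or C, as stated above) and a terminal G; the function name and its
parameters are illustrative only.

```python
def expand_anchored_primer(n_t=12, mixture="AGC", anchor="G"):
    """Return the individual sequences represented by an anchored oligo-dT primer.

    n_t     -- number of T residues (12 for T12MG)
    mixture -- bases present at the degenerate position M
    anchor  -- fixed 3'-terminal base
    """
    return ["T" * n_t + base + anchor for base in mixture]


if __name__ == "__main__":
    for primer in expand_anchored_primer():
        print(primer, len(primer))
```

Each of the three sequences is 14 nucleotides long, so the degenerate primer is a pool of
three oligonucleotides that differ only at the penultimate position.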

Cloning and Sequencing of Partial and Full-Length cDNAs and Analysis 

The reamplified band from differential display was cloned into the TA
cloning vector pCRII (Invitrogen) and sequenced on both strands using T7 and SP6
primers. cDNA libraries from 21PT and 76N cells constructed in Lambda Zap II
(Stratagene, San Diego, CA) were screened using the cloned PCR product as a probe
and several cDNA clones were isolated and sequenced on both strands. The longest
cDNA clone (from the 76N library) was sequenced on both strands using an ABI
automated sequencer Model 373A by the Dana Farber Molecular Biology Core Facility.
Oligonucleotides used for sequencing were synthesized by the Dana Farber Molecular
Biology Core Facility or by Amitof, Inc. (Cambridge, MA). The predicted protein
coding region and non-translated regions were determined and formatted using the
GCG Publish program. The predicted protein sequence was compared to protein
databases using the Blast algorithm (Altschul SF, (1990) J. Mol. Biol. 215:403-410).
Protein alignment with related proteins was performed on GCG using the Pileup,
Distances, and Prettyplot programs.

Northern and Southern Analysis 

Total cell RNA was isolated by the guanidinium isothiocyanate/cesium
chloride method and analyzed on Northern blots as previously described (Sambrook J,
(1989) Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY). 36B4 (Masiakowski P, Breathnach R, Bloch J, et al. (1982)
Nucl. Acids Res. 10:7895-7903), a ribosomal protein whose message is constant under a
variety of conditions, was used to normalize the blots. Total cellular DNA was isolated
and analyzed on Southern blots as described (Sambrook J, (1989) Molecular Cloning. A
Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY).
Densitometric analysis of autoradiographs was performed with an imaging densitometer
(Biorad GS-700) using the Molecular Analyst software.

Production of polyclonal antibody and western blotting

The MAP peptide (multiple antigen peptide) (Tam JP. (1988) Proc. Natl.
Acad. Sci. USA 85:5409-5413) 73GKHNLRQRESSQEQS87 (0.5mg) was emulsified
with an equal volume of Freund's adjuvant and injected into 3 to 9 month old New
Zealand white rabbits. Boosts were done 2 and 6 weeks later. The animals were bled
and serum was collected and stored at -20°C. Peptide and antibody production was
done at Research Genetics, Huntsville, AL.

Whole cell lysates were prepared by sonicating 10^ cells/ml for twenty 30-
second pulses in a Sonicator Ultrasonic Processor in mammalian lysis buffer
(4 mM NaHCO3, 100 mM NaP, 20 mM KH2PO4, 2 mM sodium orthovanadate, 5 mM
EDTA, 5 mM diisopropyl fluorophosphate, 2 mM PMSF, 2 µg/ml leupeptin, 2 µg/ml
aprotinin, pH 7.2). Lysates were clarified by spinning at 14,000 x g for 30 minutes in a
microfuge (Eppendorf).

50 to 100 µg of cell lysate was denatured by heating in SDS-PAGE
sample buffer (50 mM Tris-HCl, pH 6.8, 0.1 mM DTT, 2% SDS, 0.1% bromphenol blue,
10% glycerol) at 90°C for 5 minutes and run on a 12% premade acrylamide/SDS
minigel (Biorad), electroblotted onto a PVDF membrane (0.2 µm, Biorad) and reacted with
immune serum (1:1000). Anti-rabbit IgG horseradish peroxidase linked whole antibody
(Amersham) (1:2000) was used as secondary antibody, and immunoreactive bands were
detected with ECL (enhanced chemiluminescence, Amersham).

Expression of GST Fusion Protein
The full length cDNA clone was PCR amplified using the sense 5' 26-
mer oligonucleotide 5'GGAATTCCGTTGGTGCATGGCGGACC3' and the antisense
3' oligonucleotide 5'GTCGGAATTCAGGGTCACTTGGCCTG3' at 95°C, 1 minute,
60°C, 1 minute, 72°C, 1 minute for 30 cycles to yield a 0.7 kb product which contained
the open reading frame without the hydrophobic N-terminal amino acids. The resultant
PCR product encoding leu^^ to lys244 was digested with EcoRI and ligated to
alkaline phosphatase treated EcoRI linearized pGEX-2T vector (Pharmacia) to produce
plasmids encoding a GST-Protease M fusion protein. E. coli strains XL-1 blue or
DH5α transformed with this construct were grown and induced with 0.2 mM IPTG at
37°C for one hour to produce GST fusion protein which was solubilized from bacteria and
purified on glutathione agarose beads by standard methods (Smith DB, Johnson KS.
(1988) Gene 67:31-40).
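
The two PCR primers can be checked for length and for the EcoRI recognition site
(GAATTC) used in the subsequent cloning step. The sketch below assumes the primer
sequences read 5'GGAATTCCGTTGGTGCATGGCGGACC3' and
5'GTCGGAATTCAGGGTCACTTGGCCTG3'; the helper name is illustrative only.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

# Primer sequences as read from the text above (an assumption of this sketch).
SENSE_PRIMER = "GGAATTCCGTTGGTGCATGGCGGACC"
ANTISENSE_PRIMER = "GTCGGAATTCAGGGTCACTTGGCCTG"


def summarize_primer(sequence, site=ECORI_SITE):
    """Return (length, zero-based index of the restriction site or -1 if absent)."""
    sequence = sequence.upper()
    return len(sequence), sequence.find(site)


if __name__ == "__main__":
    for name, seq in (("sense", SENSE_PRIMER), ("antisense", ANTISENSE_PRIMER)):
        length, site_index = summarize_primer(seq)
        print(f"{name}: {length} nt, EcoRI site at index {site_index}")
```

Under this reading both primers are 26-mers that carry an EcoRI site near the 5' end,
which is consistent with digesting the product with EcoRI before ligation into the
EcoRI-linearized vector.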

Expression of Baculovirus Recombinant Protein 

A full length cDNA clone was cut with EcoNI and BstXI to give a
fragment which spanned nucleotides 233 to 1019. This fragment was incorporated into
the baculovirus transfer vector pVL1392 (Pharmingen). Generation and amplification of
recombinant baculovirus was as described (23,24). For production of Protease M,
Spodoptera frugiperda (cell line sf9) was infected with amplified recombinant virus to
obtain nearly 100% infection as gauged by enlarged cells. 96 hours post-infection, cells
were harvested and lysed by sonication in mammalian lysis buffer followed by adjusting
to 500 mM NaCl and rocking for one hour at 4°C. All subsequent purifications were
done at 4°C.

The lysate was adjusted to 125 mM NaCl, loaded onto p-
aminobenzamidine agarose (Sigma A7155), washed with loading buffer, and eluted with
(25 mM NaPO4, 0.02% NaN3, 500 mM NaCl, 10 mM benzamidine), pH 6.0. The eluted
fractions were loaded onto concanavalin A agarose (Sigma C8402) by rocking for 1
hour, washed with (25 mM NaPO4, 0.02% NaN3, 500 mM NaCl), pH 6.0, and eluted in
wash buffer containing 10% methyl-α-D-mannopyranoside (Sigma M6882).

Assays for Protease Activity

Gelatin and casein zymography was performed essentially as described
(Heusen C and Dowdle EB. (1980) Anal. Biochem. 102:196-202; Wilson MJ, et al.
(1993) Journal of Urology 149:653-658). Samples were run on 10%
polyacrylamide/0.1% SDS gels containing 1% gelatin or casein, soaked in 2.5% Triton at
room temperature for 1 hour, and in 0.1 M glycine, pH 8.3 at 37°C, 5 to 16 hours. After
staining in amido black, areas of proteolysis appear as clear areas against the blue-black
background. Trypsin (Sigma T8642) was used as a positive control.

Protease activity was also determined by monitoring the cleavage of N-α-
benzoyl-L-arginine ethyl ester (BAEE) (Sigma B-4500). Reactions were set up in
(25 mM NaPO4, l^iM EDTA, and 1 mM BAEE), pH 7.5. Samples were added and the
change in absorbance at 260nm was monitored on the Beckman DU-6
spectrophotometer in the time-drive mode. Trypsin was used as a positive control.
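
A time-drive trace of this kind is commonly reduced to a single rate, the change in
absorbance at 260 nm per minute. The sketch below fits a least-squares line through
time and absorbance readings; the function name and the example readings are
hypothetical, and the specification does not state how the traces were evaluated.

```python
def absorbance_rate(times_min, absorbances):
    """Least-squares slope of absorbance versus time (absorbance units per minute)."""
    n = len(times_min)
    if n != len(absorbances) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    # Sum of squared deviations in time; zero means all readings share one time point.
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbances))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical readings: minutes and A260 values.
    times = [0.0, 1.0, 2.0, 3.0]
    a260 = [0.100, 0.120, 0.140, 0.160]
    print(f"rate = {absorbance_rate(times, a260):.3f} A260/min")
```

A sample with no BAEE-cleaving activity would give a slope indistinguishable from that
of a buffer-only blank, whereas the trypsin control would give a clearly positive slope.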

Expression Vector Construct and Transfection 

A full length cDNA clone was cut with EcoNI and BstXI to give a
fragment which spanned nucleotides 233 to 1019. This fragment was incorporated into
pCMVneo plasmids (Tomasetto C, et al. (1993) J. Cell Biol. 122:57-167) and checked
for correct orientation of the insert. 5x10^ MDA-MB435C cells were electroporated at
220V with 10 µg of this construct in the presence of 10mg/ml DEAE dextran. Vector
alone was used as a negative control. 10^ cells were plated in five P100 dishes in Alpha-
5% FCS. After 14 days of selection in media containing 1 mg/ml G418, the transfected
clones were refed with media containing 0.5mg/ml G418 for an additional week. Clones
were picked in cloning cylinders, expanded and maintained in Alpha-5% FCS
containing 0.5mg/ml G418.

RESULTS 
Differential Display 

Total RNAs from 21PT and 21MT-1 cell lines were compared by
differential display. Approximately 100 bands appeared in each lane of each primer pair
tested, and on the average 2-3 bands were differentially expressed. One of the bands
that was overexpressed in the 21PT lane (with primer pair OPA1/T12MG) (280 bp in
Figure 1A) was excised from the gel and PCR amplified. The resulting 280 bp PCR
product was used to probe a northern blot (Figure 1B). Two bands were detected: a
band of 1.7kb which was very high in 21PT and barely detectable in 21MT-1, and a
band of approximately 1 kb which was equal in both lanes. This mixture of two clones
was purified and the clone which hybridized only to the differentially expressed 1.7 kb
message was recovered.

Protease M: Sequence Identification

The 0.28 kb insert was used to screen a 76N cDNA library constructed in
Lambda Zap II. The longest clone isolated was sequenced in its entirety. This clone is
1526 nt in length and contains 245 bp of 5' nt sequences, 732 bp of coding sequences
(coding for a postulated protein of 244 aa), and 549 bp of 3' nt sequences (Figure 2). The
presumptive protein coding region begins with an ATG codon, which is in a good Kozak
consensus sequence (Kozak M. (1984) Nucleic Acids Research 12:857-872)
CGGCCATGA, and ends with a TGA translation stop codon. The amino terminal
portion of the postulated protein has 13 consecutive hydrophobic residues (leu4 to ala16)
which is characteristic of a signal peptide, followed by glu17-glu-gln-asn-lys21 which
resembles a pro-peptide with a potential trypsin susceptible cleavage site after lys21. A
potential N-linked glycosylation site is found at asn134. In the 3' nt region, the
expected polyadenylation signal AATAAA is found 11 base pairs upstream of the poly
A tail at 1,490 bp. Another polyadenylation signal AATAAA was found at 1,095 bp.
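
The segment lengths quoted for the longest clone can be cross-checked with simple
arithmetic. The sketch below uses only the figures given in the text; the variable names
are illustrative.

```python
# Segment lengths of the longest cDNA clone, as quoted in the text (base pairs).
FIVE_PRIME_NONTRANSLATED = 245
CODING = 732
THREE_PRIME_NONTRANSLATED = 549

# Total clone length is the sum of the three segments.
total_length = FIVE_PRIME_NONTRANSLATED + CODING + THREE_PRIME_NONTRANSLATED

# A coding region must be a whole number of codons; each codon is three bases.
codons, remainder = divmod(CODING, 3)

if __name__ == "__main__":
    print(f"total length: {total_length} nt")
    print(f"coding region: {codons} codons, remainder {remainder}")
```

The sum is 1526 nt and the 732 bp coding region corresponds to 244 codons, which
matches the 244 aa postulated protein if the TGA stop codon is counted with the 3'
non-translated sequence.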

The postulated protein sequence, compared to proteins in the database
using the Blast program, was similar to other proteins of the serine protease family. The
postulated sequence was compared to the four most closely related proteins using the
Pileup program and Distances program and displayed by the Prettyplot program and was
found to be novel (Figure 3).

Expression of mRNA in mammary and prostate cells

Figure 4A shows the results of northern blots of mammary cell lines and
strains. The two normal cell strains shown (76N and 70N) and another normal cell
strain 81N (not shown) expressed the 1.7kb Protease M message at low levels. Two
primary tumor lines (21PT and 21NT) as well as one metastatic line from the same
patient (21MT-2) expressed high levels of message (approximately 20 to 100 fold
higher than the normal strains). However, the most metastatic cell line from the same
patient (21MT-1) expressed low levels of RNA (see Figure 1A). One other primary
tumor cell line (BT474) and nine other metastatic cell lines (MCF-7, T47D, ZR-75-1,
MDA-MB-157, MDA-MB-231, MDA-MB-361, MDA-MB-435, MDA-MB-436, and
BT549) had no detectable message. Figure 4B shows northern blots of prostate cell
lines. The normal, immortalized cell strains CF3 and CF91 express moderate levels of
Protease M mRNA while another normal immortalized strain, MLC, expresses just trace
amounts. In contrast, all three of the tumor cell lines examined (DU145, LNCaP, and
PC3) failed to express any Protease M message.

Expression of mRNA in ovarian cell lines and tissue

A series of normal immortalized and primary tumor derived ovarian cell
lines were examined for expression of mRNA for Protease M on northern blots. The
message was not expressed in any of the five normal immortalized cell lines, but was
detected in five of the eight primary tumor cell lines examined (not shown). We also
examined the RNA from a series of normal ovarian tissue and biopsies from primary
tumors (one of the two northern blots is shown in Figure 5). While mRNA was not
expressed in the three normal tissues examined, the six borderline ovarian tumor tissues,
and the two metastatic tumors from colon primaries, it was expressed in the primary
ovarian tumor tissue in sixteen of the twenty specimens examined.

Expression of Protease M mRNA in normal human tissue 

A blot containing 2 µg of polyA+ RNA from eight normal human tissues
(Clontech) was examined for expression of Protease M (Figure 6). While the message
was not detected in heart, placenta, lung, liver, or skeletal muscle, high levels of
message were detected in brain, kidney, and pancreas. The message detected in brain
and kidney was 1.7 to 1.8kb, but the message detected in pancreas was only about 1.2kb.
A probable explanation for the smaller message in pancreatic RNA would be the use of
the alternative polyadenylation signal at 1090 bp noted in Figure 1.


Production of polyclonal antibody and its use to study expression of protein in 
mammary cell lines and strains 

A polyclonal antibody was produced in rabbits against a hydrophilic
peptide which was not highly conserved among other serine proteases
(73GKHNLRQRESSQEQS87). The western blot (Figure 7) shows that the antibody
detects a protein of 37kd in total cell lysates of the normal mammary epithelial cell
strain 81N, and in the primary tumor cell line 21NT. No protein is detected in the
metastatic breast cell line MDA-MB-435. In other western blots (not shown) the
antibody detected a 37kd protein in the normal strains 70N and 76N, as well as the
primary tumor cell line 21PT, but not in the metastatic cell lines T47D and MCF-7. Up
to one ml of conditioned media from 70N and 21NT was examined in western blots
probed with this antibody and no reacting proteins were detected (not shown). This
result suggests that the protein is primarily localized intracellularly and not secreted.
The protein detected by the antibody is 37kd while the amino acid sequence predicts a
protein of about 27kd. The potential glycosylation site at (asn134-thr-thr136) might
explain this size discrepancy.

Table 1 shows that the RNA levels for the serine protease are not always
correlated with the protein levels. While the primary tumor cell lines (21NT and 21PT)
have 20 to 100 times more Protease M mRNA than normal cell strains (70N, 76N, and
81N), the protein detected on westerns is equal to or somewhat lower for the primary
tumor cell lines than in the normal cell strains.

The antipeptide polyclonal Protease M antibody has been used
successfully in western blots but does not seem to work in cellular immunofluorescence
studies, in which the antibody gave a high background with MDA-MB-435 cells which
do not express the Protease M message.
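
The predicted size of about 27kd for the 244 aa sequence, and its gap from the 37kd
band seen on western blots, can be approximated as follows. The sketch assumes an
average residue mass of 110 daltons, a common rule of thumb that is not taken from the
specification; an exact value would require the full amino acid composition.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb assumption, not from the specification


def approximate_mass_kd(n_residues, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Approximate polypeptide mass in kilodaltons from its length in residues."""
    if n_residues < 0:
        raise ValueError("number of residues must be non-negative")
    return n_residues * residue_mass_da / 1000.0


if __name__ == "__main__":
    predicted = approximate_mass_kd(244)   # postulated protein of 244 aa
    observed = 37.0                        # band detected on western blots, in kd
    print(f"predicted: {predicted:.1f} kd")
    print(f"difference from observed band: {observed - predicted:.1f} kd")
```

The estimate of roughly 27 kd agrees with the figure quoted above and leaves a
difference of about 10 kd to be accounted for, for example by glycosylation.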

Production of Recombinant Protein

Extensive efforts were made to produce recombinant protein for further
study of the protease. As briefly discussed below, neither production in E. coli as a
GST-fusion protein nor in baculovirus as a pure protein was successful in providing
more than minimal amounts of the protease. The products which were recovered were
used primarily to verify the specificity of the antibody preparations.

In a further effort to obtain recombinant protein, transfectants were
produced expressing Protease M in the mammary tumor cell line MDA-MB-435.
Transfectants were screened initially for protein production, and as shown below, the
results demonstrated that only 5 of the 76 transfectants produced any protein and this
was at low levels.

Production of GST fusion protein and assay for protease activity 

The expected 52kd GST/Protease M fusion protein was purified and
yielded approximately 600 µg of fusion protein per 500 ml culture. We were able to
cleave the fusion protein by incubation with thrombin but the Protease M fragment was
degraded, even at limiting dilutions, while only the GST portion stayed intact. When we
ran the fusion protein on western blots, we needed at least 1 µg to get a detectable signal.

Up to 1 µg of GST/Protease M fusion protein was run on casein and
gelatin zymograms (Heusen C, Dowdle EB. (1980) Anal. Biochem. 102:196-202;
Wilson MJ, et al. (1993) Journal of Urology 149:653-658) with no evidence of any
protease activity while as little as 0.5 ng of bovine trypsin gave detectable protease
activity. 5 µg of fusion protein did not cleave the chromogenic trypsin substrate BAEE
while 1 µg of trypsin gave consistently positive results.

Production of Baculovirus Recombinant Protein

50 µg of lysates prepared from sf9 cells infected with an amplified stock
of Protease M recombinant baculovirus were run on a western blot and probed with anti-
Protease M antibody (Figure 7). While no reacting proteins were seen in the lysate from
uninfected sf9 cells, a protein of 39kd was detected in lysates of sf9 infected with
recombinant baculovirus. Sf9/lG3 (Schachter M. (1980) Kallikreins (kininogenases)
Pharmacol. Rev. 31:1-17) had approximately 50% infected, enlarged cells while
sf9/lG3 (Reigman PH, Vlietstra RJ, Suurmeijer L, et al. (1992) Genomics 14:6-11)
which was infected with 5 times more virus had nearly 100% infected cells. However,
the amount of recombinant protein was quite low and we were unable to detect a band of
39kd on Coomassie blue stained gels (not shown).

The best purification protocol for recombinant Protease M
from lysates was p-aminobenzamidine agarose affinity chromatography followed by
concanavalin A agarose chromatography. Using this protocol, recombinant Protease M
was purified approximately 80-fold. However, the protein was still only 10% pure
(judging from silver-stained gels) and the yield was calculated to be less than 1 µg/10^
cells. Using these data we were able to calculate that 50 µg of lysate contains 15 ng of
Protease M or 0.03% of the total protein. Furthermore, by comparing the amount of
the 39kd band determined on silver stained gels of the 80-fold purified Protease M with
western blots of the purified protein, we were able to determine that the antibody can
detect 5 ng of Protease M protein as a lower limit. Up to 100 µg of lysate or 100 ng of
80-fold purified Protease M (containing approximately 10 ng of Protease M) was run
on gelatin and casein zymograms and no protease activity was detected (not shown). Up
to 0.5 ng of trypsin run in parallel lanes was detected.
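
The abundance figure quoted above can be reproduced with a short calculation. The
sketch takes the lysate load as 50 µg, the reading under which the stated 0.03% follows
from 15 ng of Protease M; the function names are illustrative only.

```python
def percent_of_total(target_ng, total_ug):
    """Percentage of total protein represented by a target protein."""
    if total_ug <= 0:
        raise ValueError("total protein must be positive")
    total_ng = total_ug * 1000.0          # 1 microgram = 1000 nanograms
    return 100.0 * target_ng / total_ng


def lysate_needed_ug(detection_limit_ng, percent):
    """Micrograms of lysate that contain the given amount of target protein."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    return detection_limit_ng / (percent / 100.0) / 1000.0


if __name__ == "__main__":
    share = percent_of_total(15.0, 50.0)   # 15 ng of Protease M in 50 ug of lysate
    print(f"Protease M share of total protein: {share:.2f}%")
    # Lysate required to reach the 5 ng lower limit of antibody detection.
    print(f"lysate for 5 ng: {lysate_needed_ug(5.0, share):.1f} ug")
```

On these figures a lane would need somewhat under 17 µg of lysate to reach the 5 ng
detection limit, which is consistent with a band being visible at a 50 µg load.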

MDA-MB435 Transfectants 

A pCMV/Neo/Protease M construct as well as neo-vector controls were
transfected into MDA-MB435 cells (5 x 10^ cells for each construct) by electroporation.
Of the 10^6 cells which survived the electroporation, approximately 400 colonies (one in
2,500) survived the G418 selection. 80 colonies of protease transfected clones and 20
colonies of vector transfected clones were transferred to 24 well dishes when they were
2 to 3 mm in diameter. The protease transfected cells grew more slowly and had more
enlarged, dying cells than the vector controls. Total cell lysates were prepared from the
76 protease transfectants when the cells were approximately 70% confluent. Western
blots, prepared from 50 µg of the lysate from the 76 transfectants as well as 50 µg of
lysate from 70N (positive control), were probed with the Protease M antibody. Only 2
[...] protein (data not shown). Furthermore, the level of protein expressed by these positive
clones was, in all cases, considerably less than in 70N cells.

Table 2 shows that Protease M RNA was found in clones expressing
protein as well as the majority of those not expressing protein. Thus, in MDA-MB-435
cells there is either inefficient translation of the message, or the protein translated is
extremely unstable.

Table 1. Expression of Protease M mRNA and protein in mammary cells

Cell Line                      RNA(1)    Protein(2)
70N                            5         100
81N                            4         60
16-1-1 (76N/HPV16)             4         64
21NT                           85        47
21PT                           100       76
MDA-MB-435                     0         0
T47D                           0         0
MCF-7                          0         0

(1) RNA values were obtained by running 10 µg of total RNA on a northern blot,
hybridizing to 32P-labeled Protease M probes and quantitating the resulting
autoradiograms. The most intense band was set equal to 100 and the other values
normalized accordingly.
(2) Protein values were obtained by running 50 µg of total cell
lysates on a western blot and probing with the Protease M antibody as described in
methods. The 37kd bands on the autoradiograms were quantitated, the most intense
band was set equal to 100 and the other values normalized accordingly.

Table 2. Analysis of Protease M RNA and Protein Expression in MDA-MB-435
Transfectants

Cell Line                      RNA(1)    Protein(2)
70N                            12        100
MDA-MB-435                     0         0
Protease M transfectant #13    4         0
#19                            10        0
#42                            96        25
#44                            61        12
#53                            0         0
#58                            100       0
#59                            6         0
#64                            22        0
#65                            44        25
#66                            55        63
#75                            22        0
#86                            0         0

(1), (2) These values were determined as in the footnote to Table 1.
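
The normalization described in the footnote to Table 1, in which the most intense band
is set equal to 100 and the other values are scaled accordingly, can be sketched as
follows. The function name and the raw densitometry readings in the example are
hypothetical; the specification reports only the normalized values.

```python
def normalize_to_most_intense(raw_values):
    """Scale band intensities so that the most intense band equals 100.

    raw_values -- mapping of sample name to raw densitometry reading
    Returns a mapping of sample name to the rounded normalized value.
    """
    if not raw_values:
        raise ValueError("no readings supplied")
    strongest = max(raw_values.values())
    if strongest <= 0:
        raise ValueError("at least one band must have a positive intensity")
    return {name: round(100.0 * value / strongest) for name, value in raw_values.items()}


if __name__ == "__main__":
    # Hypothetical raw readings in arbitrary units.
    readings = {"sample A": 2.0, "sample B": 1.0, "sample C": 0.0}
    for name, value in normalize_to_most_intense(readings).items():
        print(name, value)
```

Because each column is scaled to its own strongest band, RNA and protein values in the
tables are comparable within a column but not directly across columns.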



Table 3. NORTHERN RESULTS WITH OVARIAN CELL LINES AND TISSUES

                  FRACTION OF CELLS EXPRESSING
CELL LINES
    NORMAL        0/5
    TUMOR         5/8
TISSUES
    NORMAL        0/2
    TUMOR         13/19

Table 4.
CELL LINE      DESCRIPTION                                                  Protease M RNA EXP   6A2 RNA EXP
70N            normal human mammary epithelial
76N            normal human mammary epithelial                              +
81N            normal human mammary epithelial                              +
21NT           primary breast carcinoma                                     ++ to +++
21PT           primary breast carcinoma
21MT2          metastatic breast carcinoma (pleural effusion)
21MT1          metastatic breast carcinoma (pleural effusion)
MDA-MB-157     breast medulla carcinoma (pleural effusion)
MDA-MB         breast adenocarcinoma (pleural effusion)
MDA-MB-361     breast adenocarcinoma (brain metastasis)
MDA-MB-435     breast ductal carcinoma (pleural effusion)
MDA-MB-436     breast adenocarcinoma (pleural effusion)
BT-474         breast invasive ductal carcinoma (primary)
BT-549         breast invasive ductal carcinoma (metastasis to lymph nodes)
HS578T         breast ductal carcinoma (primary)
MCF7           breast adenocarcinoma (primary)
T-47D          breast ductal carcinoma (pleural effusion)
ZR-75-1        breast ductal carcinoma (ascitic effusion)
56NF           normal breast fibroblast



PC-3 

(CRLI435) 
WiDR 
(CCL218) 
SW48 
(CCL228) 
MIA Pa-CA-2 
(CCL1420) 
HuTu 80 



prostate adenocarcinoma 



colon adenocarcinoma 



colon adenocarcinoma 



pancreatic carcmoma 



duodenal adenocarcinoma 



(+) 
(+) 
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CELL LINE DESCRIPTION Protease M RNA 6A2 RNA EXP . 

EXP 



T24            bladder transitional cell carcinoma
A549 (CCL185)  lung carcinoma
Calu-1         lung epidermoid carcinoma
Oat 4          lung small cell carcinoma
G-361          malignant melanoma
SMKE 30        malignant melanoma
A2058          malignant melanoma
SCC-25         tongue squamous cell carcinoma
RD             rhabdomyosarcoma of pelvis
Kaposi         Kaposi's sarcoma
FS3            foreskin fibroblast
Leukocyte      normal leukocytes
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TABLE 5. SHOWS RNA EXPRESSION IN MAMMARY TISSUE 



SAMPLE 
81N 

MDA-MB-435 



TYPE 

N cell strain 
T cell line 



Protease MASPIN CX26 
M 



CX43 



1 M I 



CHTN 4253B 
CHTN 4420A 
CHTN 4782B 
CHTN 5075A 



CA 
CA 
CA 
CA 



+/- 

+ 



CHTN 4253A 
CHTN 4303 
CHTN 628 IE 



NAT 
NAT 
NAT 



CHTN 4728A 
CHTN 4760C 
CHTN 5303A 
RM (10/30/87) 
RM-70N 
RM-70N 
RM-83N 



RM 
RM 
RM 
RM 
RM 
RM 
RM 



+ 
+ 



EQUIVALENTS

Those skilled in the art will recognize, or be able to ascertain using no 
more than routine experimentation, many equivalents to the specific embodiments of the 
invention described herein. Such equivalents are intended to be encompassed by the 
following claims. 
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SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT:

(A) NAME: DANA-FARBER CANCER INSTITUTE 

(B) STREET: 44 BINNEY STREET 

(C) CITY: BOSTON 

(D) STATE: MASSACHUSETTS 

(E) COUNTRY: US 

(F) POSTAL CODE (ZIP) : 02115 

(G) TELEPHONE: 

(H) TELEFAX: 



(ii) TITLE OF INVENTION: PROTEASE M, A NOVEL SERINE PROTEASE 
(iii) NUMBER OF SEQUENCES: 2 

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: LAHIVE & COCKFIELD, LLP 

(B) STREET: 28 STATE STREET 

(C) CITY: BOSTON 

(D) STATE: MASSACHUSETTS 

(E) COUNTRY: US 

(F) ZIP: 02109-1875 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS /MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(Vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/US97/ 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 60/025,301 

(B) FILING DATE: 13 SEPTEMBER 1996 

(viii) ATTORNEY /AGENT INFORMATION: 

(A) NAME: MANDRAGOURAS , AMY E. 

(B) REGISTRATION NUMBER: 36,207 

(C) REFERENCE /DOCKET NUMBER: DFN-009PC 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (617)227-7400 

(B) TELEFAX: (617)742-4214 



(2) INFORMATION FOR SEQ ID NO : 1 ; 
CTG GAG AGG GAC TGC TCA GCC AAC ACC ACG AGC TGC CAC ATC CTG GGC     671
Leu Glu Arg Asp Cys Ser Ala Asn Thr Thr Ser Cys His Ile Leu Gly
            130                 135                 140

TGG GGC AAG ACA GCA GAT GGT GAT TTC CCT GAC ACC ATC CAG TGT GCA     719
Trp Gly Lys Thr Ala Asp Gly Asp Phe Pro Asp Thr Ile Gln Cys Ala
        145                 150                 155

TAC ATC CAC CTG GTG TCC CGT GAG GAG TGT GAG CAT GCC TAC CCT GGC     767
Tyr Ile His Leu Val Ser Arg Glu Glu Cys Glu His Ala Tyr Pro Gly
    160                 165                 170

CAG ATC ACC CAG AAC ATG TTG TGT GCT GGG GAT GAG AAG TAC GGG AAG     815
Gln Ile Thr Gln Asn Met Leu Cys Ala Gly Asp Glu Lys Tyr Gly Lys
175                 180                 185                 190

GAT TCC TGC CAG GGT GAT TCT GGG GGT CCG CTG GTA TGT GGA GAC CAC     863
Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Leu Val Cys Gly Asp His
                195                 200                 205

CTG CGA GGC CTT GTG TCA TGG GGT AAC ATC CCC TGT GGA TCA AAG GAG     911
Leu Arg Gly Leu Val Ser Trp Gly Asn Ile Pro Cys Gly Ser Lys Glu
            210                 215                 220

AAG CCA GGA GTC TAC ACC AAC GTC TGC AGA TAC ACG AAC TGG ATC CAA     959
Lys Pro Gly Val Tyr Thr Asn Val Cys Arg Tyr Thr Asn Trp Ile Gln
        225                 230                 235

AAA ACC ATT CAG GCC AAG TGACCCTGACA TGTGACATCT ACCTCCCGAC          1008
Lys Thr Ile Gln Ala Lys
    240

CTACCACCCC ACTGGCTGGT TCCAGAACGT CTCTCACCTA GACCTTGCCT CCCCTCCTCT  1068

CCTGCCCAGC TCTGACCCTG ATGCTTAATA AACGCAGCGA CGTGAGGGTC CTGATTCTCC  1128

CTGGTTTTAC CCCAGCTCCA TCCTTGCATC ACTGGGGAGG ACGTGATGAG TGAGGACTTG  1188

GGTCCTCGGT CTTACCCCCA CCACTAAGAG AATACAGGAA AATCCCTTCT AGGCATCTCC  1248

TCTCCCCAAC CCTTCCACAC GTTTGATTTC TTCCTGCAGA GGCCCAGCCA CGTGTCTGGA  1308

ATCCCAGCTC CGCTGCTTAC TGTCGGTGTC CCCTTGGGAT GTACCTTTCT TCACTGCAGA  1368

TTTCTCACCT GTAAGATGAA GATAAGGATG ATACAGTCTC CATCAGGCAG TGGCTGTTGG  1428

AAAGATTTAA GATTTCACAC CTATGACATA CATGGGATAG CACCTGGGCC GCCATGCACT  1488

CAATAAAGAA TGTATTTTAA AAAAAAAAAA AAAAAAAA                          1526
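The pairing of codons with residues in the listing above can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code; the function name `translate` and the variable `tail` are illustrative only and are not part of the application. It translates the last six codons of the coding region together with the TGA stop codon that follows them.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon, stopping after the first stop codon ('*')."""
    dna = "".join(dna.split()).upper()
    protein = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)


# Final six codons of the coding region plus the stop codon, as printed above.
tail = "AAA ACC ATT CAG GCC AAG TGA"
print(translate(tail))  # KTIQAK*
```

The same check is useful for spotting transcription errors in a listing: under the standard code TAC encodes Tyr, whereas TAG is a stop codon, so a TAG printed above a Tyr residue cannot be correct.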

(2) INFORMATION FOR SEQ ID NO:2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 244 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Lys Lys Leu Met Val Val Leu Ser Leu Ile Ala Ala Ala Trp Ala
1               5                   10                  15

Glu Glu Gln Asn Lys Leu Val His Gly Gly Pro Cys Asp Lys Thr Ser
            20                  25                  30

His Pro Tyr Gln Ala Ala Leu Tyr Thr Ser Gly His Leu Leu Cys Gly
        35                  40                  45

Gly Val Leu Ile His Pro Leu Trp Val Leu Thr Ala Ala His Cys Lys
    50                  55                  60

Lys Pro Asn Leu Gln Val Phe Leu Gly Lys His Asn Leu Arg Gln Arg
65                  70                  75                  80

Glu Ser Ser Gln Glu Gln Ser Ser Val Val Arg Ala Val Ile His Pro
                85                  90                  95

Asp Tyr Asp Ala Ala Ser His Asp Gln Asp Ile Met Leu Leu Arg Leu
            100                 105                 110

Ala Arg Pro Ala Lys Leu Ser Glu Leu Ile Gln Pro Leu Pro Leu Glu
        115                 120                 125

Arg Asp Cys Ser Ala Asn Thr Thr Ser Cys His Ile Leu Gly Trp Gly
    130                 135                 140

Lys Thr Ala Asp Gly Asp Phe Pro Asp Thr Ile Gln Cys Ala Tyr Ile
145                 150                 155                 160

His Leu Val Ser Arg Glu Glu Cys Glu His Ala Tyr Pro Gly Gln Ile
                165                 170                 175

Thr Gln Asn Met Leu Cys Ala Gly Asp Glu Lys Tyr Gly Lys Asp Ser
            180                 185                 190

Cys Gln Gly Asp Ser Gly Gly Pro Leu Val Cys Gly Asp His Leu Arg
        195                 200                 205

Gly Leu Val Ser Trp Gly Asn Ile Pro Cys Gly Ser Lys Glu Lys Pro
    210                 215                 220

Gly Val Tyr Thr Asn Val Cys Arg Tyr Thr Asn Trp Ile Gln Lys Thr
225                 230                 235                 240

Ile Gln Ala Lys
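For computational work, a three-letter listing such as SEQ ID NO:2 is usually converted to one-letter codes. The sketch below is illustrative only: the helper `to_one_letter` is a hypothetical name, and the example covers just the first 16 residues of the sequence. It rejects any token that is not one of the twenty standard three-letter codes, which also catches transcription slips such as "Gin" for "Gln".

```python
# The twenty standard amino acids, three-letter code to one-letter code.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert a whitespace-separated three-letter sequence to one-letter codes."""
    residues = []
    for position, token in enumerate(three_letter_seq.split(), start=1):
        if token not in THREE_TO_ONE:
            # Report the 1-based position so the listing can be corrected.
            raise ValueError(f"unrecognized residue {token!r} at position {position}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


# Residues 1-16 of SEQ ID NO:2.
first_16 = "Met Lys Lys Leu Met Val Val Leu Ser Leu Ile Ala Ala Ala Trp Ala"
print(to_one_letter(first_16))  # MKKLMVVLSLIAAAWA
```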
What is claimed is:

1. An isolated nucleic acid molecule comprising a nucleotide sequence
encoding Protease M or a biologically active portion thereof.

2. An isolated nucleic acid molecule comprising the nucleotide sequence of
SEQ ID NO: 1.

3. An isolated nucleic acid molecule at least 15 nucleotides in length which
hybridizes under stringent conditions to a nucleic acid molecule comprising the
nucleotide sequence of SEQ ID NO: 1.

4. The isolated nucleic acid molecule of claim 1, comprising the coding
region of the nucleotide sequence of SEQ ID NO: 1.

5. An isolated nucleic acid molecule comprising a nucleotide sequence
encoding a protein, wherein the protein comprises an amino acid sequence at least 60 %
homologous to the amino acid sequence of SEQ ID NO: 2.

6. The isolated nucleic acid molecule of claim 5, wherein the protein
comprises an amino acid sequence at least 70 % homologous to the amino acid sequence
of SEQ ID NO: 2.

7. The isolated nucleic acid molecule of claim 5, wherein the protein
comprises an amino acid sequence at least 80 % homologous to the amino acid sequence
of SEQ ID NO: 2.

8. The isolated nucleic acid molecule of claim 5, wherein the protein
comprises an amino acid sequence at least 90 % homologous to the amino acid sequence
of SEQ ID NO: 2.

9. An isolated nucleic acid molecule encoding the amino acid sequence of
SEQ ID NO: 2.

10. An isolated nucleic acid molecule encoding a Protease M fusion protein.
11. An isolated nucleic acid molecule which is antisense to the nucleic acid
molecule of claim 1.

12. The isolated nucleic acid molecule of claim 1 which is antisense to a
coding region of the coding strand of the nucleotide sequence of SEQ ID NO: 1.

13. The isolated nucleic acid molecule of claim 1 which is antisense to a
noncoding region of the nucleotide sequence of SEQ ID NO: 1.

14. The isolated nucleic acid molecule of claim 1 isolated using at least a
portion of the nucleotide sequence of SEQ ID NO: 1 as a probe or a primer.

15. A vector comprising a nucleotide sequence encoding Protease M.

16. The vector of claim 15, which is a recombinant expression vector.

17. The vector of claim 16, which encodes a protein comprising the amino
acid sequence of SEQ ID NO: 2.

18. The vector of claim 15, which comprises the coding region of the
nucleotide sequence of SEQ ID NO: 1.

19. A host cell containing the vector of claim 17.

20. A host cell containing the recombinant expression vector of claim 18.

21. A method for producing Protease M comprising culturing the host cell of
claim 19 in a suitable medium until Protease M is produced.

22. The method of claim 21, further comprising isolating Protease M from
the medium or the host cell.

23. An isolated Protease M protein or a biologically active portion thereof.

24. An isolated Protease M protein, wherein said protein is encoded by the
nucleic acid shown in SEQ ID NO: 1.
25. An isolated protein which comprises an amino acid sequence at least 60
% homologous to the amino acid sequence of SEQ ID NO: 2.

26. An isolated protein which comprises an amino acid sequence at least 70
% homologous to the amino acid sequence of SEQ ID NO: 2.

27. An isolated protein which comprises an amino acid sequence at least 80
% homologous to the amino acid sequence of SEQ ID NO: 2.

28. An isolated protein which comprises an amino acid sequence at least 90
% homologous to the amino acid sequence of SEQ ID NO: 2.
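By way of illustration only, a percentage threshold of the kind recited in the preceding claims can be evaluated on two sequences that have already been aligned. The sketch below rests on assumptions that this section does not state: homology is counted as identical residues divided by compared alignment columns, gaps are written as "-", and the function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of compared alignment columns holding identical residues.

    Both inputs must come from the same alignment and so have equal length.
    Columns where both sequences carry a gap are skipped; a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Ten aligned positions with a single substitution (V -> A at position 6).
reference = "MKKLMVVLSL"
candidate = "MKKLMAVLSL"
print(percent_identity(reference, candidate))  # 90.0
```

Other conventions (for example, counting conservative substitutions, or dividing by the length of the shorter sequence) give different figures, so the definition in force should be taken from the specification rather than from this sketch.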

29. An isolated protein comprising amino acids 22-244 of SEQ ID NO: 2.

30. A pharmaceutical composition comprising the protein of SEQ ID NO: 2 or
biologically active portion thereof and a pharmaceutically acceptable carrier.

31. A fusion protein comprising a Protease M polypeptide operatively linked
to a non-protease M polypeptide.

32. An antigenic peptide of Protease M comprising at least 8 amino acid
residues of the amino acid sequence shown in SEQ ID NO: 2, the peptide comprising an
epitope of Protease M such that an antibody raised against the peptide forms a specific
immune complex with Protease M.

33. An antibody that specifically binds Protease M.

34. The antibody of claim 33, which is monoclonal.

35. The antibody of claim 34, which is coupled to a detectable substance.

36. A pharmaceutical composition comprising the antibody of claim 34 and a
pharmaceutically acceptable carrier.

37. A nonhuman transgenic animal which contains cells carrying a transgene
encoding Protease M.
38. A nonhuman homologous recombinant animal which contains cells
having an altered Protease M gene.

39. A method for detecting the presence of Protease M in a biological sample
comprising contacting a biological sample with an agent capable of detecting Protease
M protein or nucleic acid.

40. The method of claim 39, wherein the agent is a labeled or labelable
nucleic acid probe capable of hybridizing to Protease M nucleic acid.

41. The method of claim 40, wherein the agent is a labeled or labelable
antibody capable of specifically binding to Protease M protein.

42. The method of claim 40, wherein the biological sample is a tumor
sample.

43. The method of claim 40, wherein the tumor sample is a mammary tumor
sample.

44. A kit for detecting the presence of protease M in a biological sample
comprising a labeled or labelable agent capable of detecting protease M protein or
nucleic acid in a biological sample; means for determining the degree of binding to the
sample; and means for comparing the amount of binding to the sample with a
standard.

45. The kit of claim 44, wherein the agent is a nucleic acid probe capable of
hybridizing to protease M nucleic acid.

46. The kit of claim 44, wherein the agent is an antibody capable of
specifically binding to protease M protein.

47. A method comprising contacting a cell with an agent that modulates
protease M serine proteinase activity associated with the cell.

48. The method of claim 47, wherein the agent stimulates the protease M
cysteine proteinase inhibitory activity associated with the cell.
49. The method of claim 47, wherein the agent inhibits the protease M serine
proteinase activity associated with the cell.

50. The method of claim 48, wherein the agent is an active protease M
protein.

51. The method of claim 48, wherein the agent is a nucleic acid encoding
protease M that has been introduced into the cell.

52. The method of claim 49, wherein the agent is an antisense protease M
nucleic acid molecule.

55. The method of claim 49, wherein the agent is an antibody that
specifically binds to protease M.

56. The method of claim 47, wherein the cell is present within a subject and
the agent is administered to the subject.

57. A method for inhibiting development or progression of a metastatic
phenotype in a tumor cell comprising contacting the tumor cell with an agent which
modulates the amount of or activity of protease M in or around the tumor cell.

58. The method of claim 57, wherein the agent is protease M.

59. The method of claim 57, wherein the agent is a nucleic acid encoding
protease M that has been introduced into the tumor cell.

60. The method of claim 57, wherein the agent is a nucleic acid antisense to
protease M that has been introduced into the tumor cell.

61. The method of claim 57, wherein the tumor cell is a mammary tumor
cell.
62. A method for identifying a modulator of the serine protease activity of
protease M, comprising

incubating protease M, a serine protease, a substrate for the serine
protease and a test substance under conditions suitable for the serine protease to cleave
the substrate;

measuring the cleavage of the substrate;

comparing the amount of cleavage of the substrate in the presence of the
test substance to the amount of cleavage of the substrate in the absence of the test
substance; and

identifying the test substance as a modulator of the serine protease
inhibitory activity of protease M.
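The comparison step recited above reduces to simple arithmetic on two measurements. The following is a minimal sketch under stated assumptions: the cleavage readings are in arbitrary but identical units, the 20 % cutoff is a hypothetical default and is not taken from the application, and the function names are illustrative.

```python
def percent_change(cleavage_with: float, cleavage_without: float) -> float:
    """Change in substrate cleavage caused by the test substance, in percent.

    cleavage_without is the control reading (no test substance) and must be
    positive, because it is used as the reference value.
    """
    if cleavage_without <= 0:
        raise ValueError("control cleavage must be positive")
    return 100.0 * (cleavage_with - cleavage_without) / cleavage_without


def classify(cleavage_with: float, cleavage_without: float,
             threshold_pct: float = 20.0) -> str:
    """Label a test substance using a hypothetical percent-change threshold."""
    change = percent_change(cleavage_with, cleavage_without)
    if change <= -threshold_pct:
        return "inhibitor"
    if change >= threshold_pct:
        return "stimulator"
    return "no effect"


# Control reading of 100 units falls to 40 units with the test substance.
print(percent_change(40.0, 100.0))  # -60.0
print(classify(40.0, 100.0))        # inhibitor
```

In practice the cutoff would be set from replicate measurements and the variability of the assay rather than fixed in advance.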

63. A method for identifying a modulator of protease M expression,
comprising

contacting a cell with a test substance;

determining the level of expression of protease M mRNA or protein in
the cell;

comparing the level of expression of protease M mRNA or protein in the
cell in the presence of the test substance to the level of expression of protease M mRNA or
protein in the cell in the absence of the test substance; and

identifying the test substance as a modulator of protease M expression.
PROTEASE M SEQUENCE [drawing: nucleotide sequence, numbered to position 1526, with the deduced amino acid sequence, numbered to residue 244]
FIG. 4B [drawing: lane groups labeled NORMAL and TUMOR]

FIG. 5 [drawing]
PCT/US 97/16175

A. CLASSIFICATION OF SUBJECT MATTER

IPC 6  C12N15/57   C12N15/62   C12N9/64     C12N15/11    C12N5/10
       A61K38/48   C07K16/40   A61K39/395   A01K67/027   C12N15/00
       C12Q1/37    C12Q1/68    G01N33/574   A61K31/70    A61K48/00

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)

IPC 6  C12N

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practical, search terms used)

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category *   Citation of document, with indication, where appropriate,      Relevant to claim No.
             of the relevant passages

X            EP 0 576 152 A (LILLY CO ELI) 29 December 1993                  1-9, 14-29, 32-35, 39-41, 44-46
             see the whole document
Y                                                                            10-13, 30, 31, 36-38, 42, 43, 47-63

Y            VIHINEN M: "Modeling of prostate specific antigen and           1-9, 14-36, 39-63
             human glandular kallikrein structures."
             BIOCHEM BIOPHYS RES COMMUN, NOV 15 1994, 204 (3) P1251-6,
             UNITED STATES, XP002060074
             see the whole document

             -/--

[X] Further documents are listed in the continuation of box C.

[X] Patent family members are listed in annex.

* Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance
"E" earlier document but published on or after the international filing date
"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or other means
"P" document published prior to the international filing date but later than the priority date claimed
"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention
"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone
"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art
"&" document member of the same patent family

Date of the actual completion of the international search: 24 March 1998

Date of mailing of the international search report: 0 7. 0*1. 98

Name and mailing address of the ISA:
European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk

Authorized officer

PCT/US 97/16175

C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Category     Citation of document, with indication, where appropriate,      Relevant to claim No.
             of the relevant passages

Y            HAGEN HE ET AL: "Simulium damnosum s.l.: identification         1-9, 14-29, 32-35, 39-46
             of inducible serine proteases following an Onchocerca
             infection by differential display reverse transcription PCR."
             EXP PARASITOL, NOV 1995, 81 (3) P249-54,
             UNITED STATES, XP002060075
             see the whole document

Y            EP 0 652 014 A (NAT INST IMMUNOLOGY) 10 May 1995                30, 36, 47-63
             see claims 1-10

Y            WO 96 26280 A (BASF AG ;KAMENS JOANNE (US); ALLEN HAMISH        10-14, 31
             (US); PASKIND MICHAEL (U) 29 August 1996
             see abstract; claims 23,24,28

Y            WO 90 05188 A (PHARMACEUTICAL PROTEINS LTD) 17 May 1990         10, 31, 37, 38
             see abstract; claims 14-17

P,X          ANISOWICZ A ET AL: "A novel protease homolog                    1-9, 14-29, 32-35, 39-41, 44-46
             differentially expressed in breast and ovarian cancer"
             MOLECULAR MEDICINE (CAMBRIDGE), 2 (5). 30 SEP 1996.
             624-636., XP002060076
             see the whole document



INTERNATIONAL SEARCH REPORT

PCT/US 97/16175

Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)

This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

1. [X] Claims Nos.:
   because they relate to subject matter not required to be searched by this Authority, namely:
   see FURTHER INFORMATION sheet PCT/ISA/210

2. [ ] Claims Nos.:
   because they relate to parts of the International Application that do not comply with the prescribed requirements to such an extent that no meaningful International Search can be carried out, specifically:

3. [ ] Claims Nos.:
   because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:

1. [ ] As all required additional search fees were timely paid by the applicant, this International Search Report covers all searchable claims.

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment of any additional fee.

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this International Search Report covers only those claims for which fees were paid, specifically claims Nos.:

4. [ ] No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

Remark on Protest
[ ] The additional search fees were accompanied by the applicant's protest.
[ ] No protest accompanied the payment of additional search fees.



International Application No. PCT/US 97/16175

FURTHER INFORMATION CONTINUED FROM PCT/ISA/210

Remark: Although claim 55 and claims 47-55, 57-61, as far as they
concern an in vivo method, are directed to a method of treatment of the
human/animal body, the search has been carried out and based on the
alleged effects of the compound/composition.



Information on patent family members

PCT/US 97/16175

Patent document            Publication    Patent family          Publication
cited in search report     date           member(s)              date

EP 0576152 A               29-12-93       [illegible]            02-12-93
                                          BR 9302075 A           30-Mr93
                                          CA 2096911 A           29-11-93
                                          CZ 9300982 A           16-02-94
                                          HU 69612 A             28-09-95
                                          JP 6062855 A           08-03-94
                                          MX 9303082 A           30-06-94
                                          NO 931889 A            29-11-93
                                          PL 299053 A            21-02-94

EP 0652014 A               10-05-95       NONE

WO 9626280 A               29-08-96       NONE

WO 9005188 A               17-05-90       AT 158817 T            15-10-97
                                          AU 528101 B            10-09-92
                                          AU 4494389 A           28-05-90
                                          DE 68928353 D          06-11-97
                                          EP 0395699 A           14-11-90
                                          JP 3505674 T           12-12-91
                                          US 5550503 A           22-07-97